Unbounded Structures from Finite Statistical Methods
James Henderson
University of Edinburgh
Many problems in natural language processing require structured
representations of unbounded size. Syntactic parsing has been the
canonical example of such a problem (parse trees can be arbitrarily
large), but there are many others. On the other hand, most methods
in machine learning and statistics are designed for unstructured,
finite representations. One popular example is log-linear models
(a.k.a. maximum entropy models), which estimate probabilities for
fixed-length feature vectors.
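For concreteness, the standard log-linear form (a textbook
formulation, not specific to this talk) defines, for an input x, a
candidate output y, a feature vector f(x, y), and a weight vector θ,

    P(y \mid x) = \frac{\exp(\theta \cdot f(x, y))}{\sum_{y'} \exp(\theta \cdot f(x, y'))}

where the sum ranges over all candidate outputs y' for x, so the
model only ever sees the fixed-length vector f(x, y).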
Much of the work on statistical parsing can be characterized in
terms of its choice of mapping from unbounded syntactic structures
to fixed-length feature vectors. I will present several approaches
to defining this mapping,
namely hand-crafted feature sets, kernel methods, feed-forward neural
networks, and recurrent neural networks. These methods differ in
the amount of task-specific knowledge which must be built in, the
generality of the models which can be learned, and the performance
which has been achieved.
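To make the recurrent-network idea concrete, here is a minimal
sketch (the Elman-style recurrence, dimensions, and random weights
are illustrative assumptions, not the specific models from the
talk) of how a recurrence folds an arbitrarily long input sequence
into one fixed-length vector that finite statistical machinery can
then consume:

    import numpy as np

    # Illustrative dimensions: each input token is an 8-dimensional
    # vector; the fixed-length representation has 16 dimensions.
    EMBED_DIM = 8
    HIDDEN_DIM = 16

    rng = np.random.default_rng(0)

    # Randomly initialised parameters (learned, in a real model).
    W_in = rng.normal(scale=0.1, size=(HIDDEN_DIM, EMBED_DIM))
    W_rec = rng.normal(scale=0.1, size=(HIDDEN_DIM, HIDDEN_DIM))
    b = np.zeros(HIDDEN_DIM)

    def encode(sequence):
        """Map a sequence of any length to one HIDDEN_DIM vector."""
        h = np.zeros(HIDDEN_DIM)
        for x in sequence:
            # Each step folds one more input into the same
            # fixed-size state, so the output never grows with
            # the length of the input.
            h = np.tanh(W_in @ x + W_rec @ h + b)
        return h

    # Sequences of different lengths map to the same fixed size.
    short_seq = [rng.normal(size=EMBED_DIM) for _ in range(3)]
    long_seq = [rng.normal(size=EMBED_DIM) for _ in range(50)]
    print(encode(short_seq).shape, encode(long_seq).shape)  # (16,) (16,)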