==Motor: gestures / language==

@article{ozcaliskan-goldinMeadow-10_sex-differences-in-gesture-presage-language, title={{Sex differences in language first appear in gesture}}, author={Ozcaliskan, Seyda and Susan Goldin-Meadow}, journal={Developmental Science}, volume={13}, number={5}, pages={752--760}, issn={1363-755X}, year={2010}, publisher={Wiley-Blackwell},
annote = {
for some years now, ozcaliskan has been studying meaningful gestures by children, and in a 2005 paper she and Goldin-Meadow showed that gesture-word combinations (e.g. point at cookie and say "eat") precede multi-word utterances by several months. here the same experimental setup (longitudinal observations) is extended to sex differences.

it is known that on average infant boys utter 2-word constructs some 3 months later than girls. this longitudinal study videotaped 40 children (22 girls, 18 boys) at home every 4 months between 14 and 34 months (90 min x 6 per child). by analyzing the videos, it was found that boys are also later in G+S combinations, again by about 3 months; e.g. at 14 mos, the avg girl has a gestural vocab of 65 tokens vs 40 tokens for the avg boy. by 18 months, boys reach 74 tokens on avg, but girls may be at 106.

while this is interesting, it is not really very surprising. regarding the methodology, however, given the wide variability in ages at which children acquire language, it is not clear whether sample bias was really avoided in a study based on forty children altogether. the high SDs on the data reflect this variation.

--- Abstract: Children differ in how quickly they reach linguistic milestones. Boys typically produce their first multi-word sentences later than girls do. We ask here whether there are sex differences in children’s gestures that precede, and presage, these sex differences in speech. To explore this question, we observed 22 girls and 18 boys every 4 months as they progressed from one-word speech to multi-word speech. We found that boys not only produced speech + speech (S+S) combinations (‘drink juice’) 3 months later than girls, but they also produced gesture + speech (G+S) combinations expressing the same types of semantic relations (‘eat’ + point at cookie) 3 months later than girls. Because G+S combinations are produced earlier than S+S combinations, children’s gestures provide the first sign that boys are likely to lag behind girls in the onset of sentence constructions.

unicode: Şeyda Özçalışkan
}}

==Perception: Objects==

@article{porway-yaoB-zhuSC-08_learning-compositional-models-for-object-categories, title={Learning compositional models for object categories from small sample sets}, author={Porway, J. and Yao, B. and Zhu, S.C.}, journal={Object categorization: computer and human vision perspectives}, year={2008},
Abstract: In this chapter we present a method for learning a compositional model in a minimax entropy framework for modeling object categories with large intra-class variance. The model we learn incorporates the flexibility of a stochastic context free grammar (SCFG) to account for the variation in object structure with the neighborhood constraints of a Markov random field (MRF) to enforce spatial context. We learn the model through a generalized minimax entropy framework that accounts for the dynamic structure of the hierarchical model. We first learn the SCFG parameters using the frequencies of object parts, then pursue spatial relations in order of greatest information gain.
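[note: a minimal sketch (python; not from the paper) of the first learning step just described -- maximum-likelihood estimation of SCFG production probabilities from observed frequencies of object parts. the rules and counts below are invented for illustration; the pursuit of spatial (MRF) relations by information gain is not shown.]

from collections import defaultdict

# (lhs, rhs) -> how often this decomposition of the object was observed in training images
rule_counts = {
    ("clock", ("frame", "face")): 80,
    ("clock", ("frame", "face", "legs")): 20,   # a structural variant of the category
    ("face",  ("hands", "numerals")): 70,
    ("face",  ("hands",)): 30,
}

def estimate_scfg_probabilities(rule_counts):
    """P(lhs -> rhs) = count(lhs -> rhs) / total count of expansions of lhs."""
    totals = defaultdict(float)
    for (lhs, _), c in rule_counts.items():
        totals[lhs] += c
    return {(lhs, rhs): c / totals[lhs] for (lhs, rhs), c in rule_counts.items()}

for (lhs, rhs), p in estimate_scfg_probabilities(rule_counts).items():
    print(f"P({lhs} -> {' '.join(rhs)}) = {p:.2f}")

[end of sketch; in the actual model these rule probabilities are further coupled with MRF spatial constraints.]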
The learned model can generalize from a small set of training samples (n < 100) to generate a combinatorially large number of novel instances using stochastic sampling. This process is similar to "recognition-by-components", a theory that postulates that biological vision systems recognize objects as composed from a dictionary of commonly appearing 3D structures. This paper provides one possible implementation of this theory. To verify our learning method and model performance, we present plots of KL divergence minimization as the algorithm proceeds, and show realistic samples drawn from the model. We also show the model accurately predicting missing or undetected parts for top-down recognition along with preliminary results showing that the model can learn a large space of category appearances from a very small (n < 15) number of training samples. Finally, we discuss a compositional boosting algorithm for inference and show examples using it for object recognition.
}}

@inproceedings{kulkarni-berg-11cvpr_baby-talk-image-descriptions, title={Baby talk: Understanding and generating simple image descriptions}, author={Kulkarni, G. and Premraj, V. and Dhar, S. and Li, S. and Choi, Y. and Berg, A.C. and Berg, T.L.}, booktitle={Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on}, pages={1601--1608}, year={2011},
annote = {
We posit that visually descriptive language offers computer vision researchers both information about the world, and information about how people describe the world. The potential benefit from this source is made more significant due to the enormous amount of language data easily available today. We present a system to automatically generate natural language descriptions from images that exploits both statistics gleaned from parsing large quantities of text data and recognition algorithms from computer vision. The system is very effective at producing relevant sentences for images. It also generates descriptions that are notably more true to the specific image content than previous work.
}}

==Perception: Action==

@article{georgeon-ritter-11_intrinsically-motivated-schema-for-emergent-cognition, author = {Olivier Georgeon and Frank Ritter}, title = {An Intrinsically-Motivated Schema Mechanism to Model and Simulate Emergent Cognition}, journal = {Cognitive Systems Research},
annote = {
considers an agent, Ernest, that moves in a maze-like environment. it uses intrinsic motivation (maximize built-in utility functions), and learns hierarchical action "schemas" in a Piagetian, constructivist manner. the simulator code is available. ?project?

simulator: http://liris.cnrs.fr/ideal/doc/GeorgeonO2011-emergent-cognition.pdf
http://e-ernest.blogspot.com/

abstract: We introduce an approach to simulate the early mechanisms of emergent cognition based on theories of enactive cognition and on constructivist epistemology. The agent has intrinsic motivations implemented as inborn proclivities that drive the agent in a proactive way. Following these drives, the agent autonomously learns regularities afforded by the environment, and hierarchical sequences of behaviors adapted to these regularities. The agent represents its current situation in terms of perceived affordances that develop through the agent’s experience. This situational representation works as an emerging situation awareness that is grounded in the agent’s interaction with its environment and that in turn generates expectations and activates adapted behaviors.
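[note on the georgeon-ritter entry above: a toy sketch (python), not the actual Ernest algorithm, of the constructivist idea of composing higher-level "schemas" out of pairs of primitive interactions the agent has actually enacted, guided by inborn valences. the valences, the stub environment, and the selection rule are all made up for illustration.]

import random

valence = {"step_forward": 1, "turn_left": 0, "turn_right": 0, "bump_wall": -2}  # inborn proclivities
motor_acts = ["step_forward", "turn_left", "turn_right"]

def enact(intended, wall_ahead):
    """Hypothetical stub environment: stepping into a wall turns into a bump."""
    return "bump_wall" if intended == "step_forward" and wall_ahead else intended

random.seed(0)
schemas = {}        # (first, second) -> number of times this sequence was enacted
previous = None
for t in range(100):
    wall_ahead = random.random() < 0.3
    # pick the primitive act with highest valence plus a little exploratory noise
    intended = max(motor_acts, key=lambda a: valence[a] + random.gauss(0, 1))
    actual = enact(intended, wall_ahead)
    if previous is not None:
        schemas[(previous, actual)] = schemas.get((previous, actual), 0) + 1
    previous = actual

# higher-level schemas the agent has built, ranked by experience; schema valence = sum of its parts
for (a, b), n in sorted(schemas.items(), key=lambda kv: -kv[1])[:5]:
    print(f"schema {a} -> {b}: enacted {n} times, valence {valence[a] + valence[b]}")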
Through its activity and these aspects of behavior (behavioral proclivity, situation awareness, and hierarchical sequential learning), the agent starts to exhibit emergent sensibility, intrinsic motivation, and autonomous learning. Following theories of cognitive development, we argue that this initial autonomous mechanism provides a basis for implementing autonomously developing cognitive systems.
}}

@inproceedings{swift-kautz-12_multimodal-corpus-language-action, title={A multimodal corpus for integrated language and action}, author={Swift, M. and Ferguson, G. and Galescu, L. and Chu, Y. and Harman, C. and Jung, H. and Perera, I. and Song, Y.C. and Allen, J. and Kautz, H.}, booktitle={Proc. of the Int. Workshop on MultiModal Corpora for Machine Learning}, year={2012},
annote = {
make tea: 12 subjects x 3 episodes [e.g. take a tea bag from the cupboard - "tea bag" can refer to the individual teabag, or to the teabag box]
use RFID emitters on objects - but didn't have enough sensitivity.
make sandwiches; snack bar activity --> want to label the data

We describe a corpus for research on learning everyday tasks in natural environments using the combination of natural language description and rich sensor data that we have collected for the CAET (Cognitive Assistant for Everyday Tasks) project. We have collected audio, video, Kinect RGB-Depth video and RFID object-touch data while participants demonstrate how to make a cup of tea. The raw data are augmented with gold-standard annotations for the language representation and the actions performed. We augment activity observation with natural language instruction to assist in task learning.
}}

@inProceedings{tokunaga-iidra-10_bilingual-multimodal-corpora-referring-expressions, title={Construction of bilingual multimodal corpora of referring expressions in collaborative problem solving}, author={Tokunaga, T. and Iidra, R. and Yasuhara, M. and Terai, A. and Morris, D. and Belz, A.}, year={2010},
annote = {
Over the last decade, with a growing recognition that referring expressions frequently appear in collaborative task dialogues (Clark and Wilkes-Gibbs, 1986; Heeman and Hirst, 1995), a number of corpora have been constructed to study the nature of their use. This tendency also reflects the recognition that this area yields both challenging research topics as well as promising applications such as human-robot interaction (Foster et al., 2008; Kruijff et al., 2010). The COCONUT corpus (Di Eugenio et al., 2000) was collected from keyboard-dialogs between two participants, who worked together on a simple 2-D design task, buying and arranging furniture for two rooms. The COCONUT corpus is limited in its annotations, which describe symbolic object information such as object intrinsic attributes and location in discrete co-ordinates. As an initial work of constructing a corpus for collaborative tasks, the COCONUT corpus can be characterised as having a rather simple domain as well. The QUAKE corpus (Byron, 2005) and its successor, the SCARE corpus (Stoia et al., 2008), deal with a more complex domain, where two participants collaboratively play a treasure hunting game in a 3-D virtual world. Despite the complexity of the domain, the participants were only allowed limited actions, e.g. moving a step forward, pushing a button, etc. As a part of the JAST project, the Joint Construction Task (JCT) corpus was created based on dialogues in which two participants constructed a puzzle (Foster et al., 2008).
The setting of the experiment is quite similar to ours except that both participants have equal roles. Since our main concern is referring expressions, we believe our asymmetric setting elicits more referring expressions than the symmetric setting of the JCT corpus. In contrast to these previous corpora, our corpora record a wide range of information useful for analysis of human reference behaviour in situated dialogue. While the domain of our corpora is simple compared to the QUAKE and SCARE corpora, we allowed a comparatively large flexibility in the actions necessary for achieving the goal shape (i.e. flipping, turning and moving of puzzle pieces at different degrees), relative to the complexity of the domain. Providing this relatively larger freedom of actions to the participants together with the recording of detailed information allows for research into new aspects of referring expressions. As for a multilingual aspect, all the above corpora are English. There have been several recent attempts at collecting multilingual corpora in situated domains. For instance, (Gargett et al., 2010) collected German and English corpora in the same setting. Their domain is similar to the QUAKE corpus. Van der Sluis et al. (2009) aim at a comparative study of referring expressions between English and Japanese. Their domain is still static at the moment. Our corpora aim at dealing with the dynamic nature of situated dialogues between very different languages, English and Japanese.
}}

@article{loucks-sommerville-12_recognizing-human-actions-4-10mo, title={Developmental changes in the discrimination of dynamic human actions in infancy}, author={Loucks, J. and Sommerville, J.A.}, journal={Developmental Science}, year={2012},
annote = {
how does an infant figure out the intentionality of an agent doing some action? Even for a simple action like reaching for and grasping a toy, 5-6 mo infants attend to multiple different properties: which toy the actor selects, the particular grasp used, the spatial trajectory of the reach, how fast the reach is executed, etc. [Woodward (1998)]

this study has 4-mo and 10-mo infants watch an actor in three situations:
- move a toy across a table [habituation]
- change how the hand contacts the toy [featural change]
- change spatial aspects (more global) - such as pose of head or body, or arm trajectory [configurational]
- change temporal aspects (speed)

though configurational changes were quantitatively larger shifts in the image, adults paid more attention to featural change, indicating that they were sensitive to functional aspects of the situation. [Loucks and Baldwin (2009)]
Loucks, J., & Baldwin, D. (2009). Sources of information for discriminating dynamic human actions. Cognition, 111 (1), 84–97.

how do we develop such sensitivity to functionally important aspects? One possibility is that infants become increasingly sensitive to both sources of information, but continue to gain sensitivity to featural information when sensitivity to configural information levels off. Another possibility is that infants begin with relatively broad sensitivity to both sources, but lose sensitivity to configural information while maintaining sensitivity to featural information.

it is known that by 5-6 months, infants gaze longer at the goal of a motion than the motion per se. thus, the particular toy being picked up is more important than the relative position, or the trajectory [configurational]. [Woodward group 2002-2009] Woodward, A.L. (2009). Infants’ grasp of others’ intentions.
Current Directions in Psychological Science, 18 (1), 53–57.

The main result is that at 4 months, infants are more sensitive to configural and temporal changes than to featural, but by 10 months, their response to configurational or temporal change is about the same as habituation, while looking time for featural change is significantly higher. In the discussion, they suggest that the motor acquisition and maturation in the intervening months may be a source for heightened awareness of the functional (featural) aspects of grasp, and its relation to object shape.

Subsequent work: 10-month-olds’ understanding of the functional consequences of the precision grasp is correlated with their ability to perform precision grasps themselves (Loucks & Sommerville, in press: The role of motor experience in understanding action function: the case of the precision grasp. Child Development).

--Abstract-- Recent evidence suggests that adults selectively attend to features of action, such as how a hand contacts an object, and less to configural properties of action, such as spatial trajectory, when observing human actions. The current research investigated whether this bias develops in infancy. We utilized a habituation paradigm to assess 4-month-old and 10-month-old infants’ discrimination of action based on featural, configural, and temporal sources of action information. Younger infants were able to discriminate changes to all three sources of information, but older infants were only able to reliably discriminate changes to featural information. These results highlight a previously unknown aspect of early action processing, and suggest that action perception may undergo a developmental process akin to perceptual narrowing.
}}

@article{rakison-krogh-12_causal-action-facilitates-causal-perception-5mo, title={Does causal action facilitate causal perception in infants younger than 6 months of age?}, author={Rakison, D.H. and Krogh, L.}, journal={Developmental Science}, year={2012},
annote = {
same perceptual input, but differing situations of causality (stickiness) are simulated through a clever use of velcro. children around 5 mos are able to perceive distinctions based on stickiness. Q. Is this learning laws of physics ("velcro sticks") or of the mysterious substance called "causality"?

abstract: Previous research has established that infants are unable to perceive causality until 6¼ months of age. The current experiments examined whether infants’ ability to engage in causal action could facilitate causal perception prior to this age. In Experiment 1, 4.5-month-olds were randomly assigned to engage in causal action experience via Velcro sticky mittens or not engage in causal action because they wore non-sticky mittens. Both groups were then tested in the visual habituation paradigm to assess their causal perception. Infants who engaged in causal action – but not those without this causal action experience – perceived the habituation events as causal. Experiment 2 used a similar design to establish that 4.5-month-olds are unable to generalize their own causal action to causality observed in dissimilar objects. These data are the first to demonstrate that infants under 6 months of age can perceive causality, and have implications for the mechanisms underlying the development of causal perception.
}}

@article{cannon-woodward-11_action-production-influences-attention-12mo, title={Action production influences 12-month-old infants’ attention to others’ actions}, author={Cannon, E.N.
and Woodward, A.L. and Gredeb{\"a}ck, G. and von Hofsten, C. and Turek, C.}, journal={Developmental Science}, year={2011},
annote = {
priming is observed from what they do to what they see, in 12-month-olds. gaze tracking data. if they did the same task earlier (behavior first), their gaze shifts were reliably quicker (mean latency drops from 71 msec to 0 msec).

action anticipation: Falck-Ytter et al. (2006): three balls are grasped by a person and deposited in a bucket. alternatively, the balls move on their own. both adults and 12-mo olds gaze-shift reliably to the goal only when it is done by a human.
},
abstract = {
Recent work implicates a link between action control systems and action understanding. In this study, we investigated the role of the motor system in the development of visual anticipation of others’ actions. Twelve-month-olds engaged in behavioral and observation tasks. Containment activity, infants’ spontaneous engagement in producing containment actions; and gaze latency, how quickly they shifted gaze to the goal object of another’s containment actions, were measured. Findings revealed a positive relationship: infants who received the behavior task first evidenced a strong correlation between their own actions and their subsequent gaze latency of another’s actions. Learning over the course of trials was not evident. These findings demonstrate a direct influence of the motor system on online visual attention to others’ actions early in development.
}}

@article{wang-mori-09pami_action-recognition-by-semilatent-topic-models, author={Yang Wang and Mori, G.}, journal={Pattern Analysis and Machine Intelligence, IEEE Transactions on}, title={Human Action Recognition by Semilatent Topic Models}, year={2009}, month={oct}, volume={31}, number={10}, pages={1762--1774}, doi={10.1109/TPAMI.2009.43},
annote = {
machine learning approach applying ideas from document information retrieval to video data.

--abstract-- We propose two new models for human action recognition from video sequences using topic models. Video sequences are represented by a novel “bag-of-words” representation, where each frame corresponds to a “word”. Our models differ from previous latent topic models for visual recognition in two major aspects: first of all, the latent topics in our models directly correspond to class labels; secondly, some of the latent variables in previous topic models become observed in our case. Our models have several advantages over other latent topic models used in visual recognition. First of all, the training is much easier due to the decoupling of the model parameters. Secondly, it alleviates the issue of how to choose the appropriate number of latent topics. Thirdly, it achieves much better performance by utilizing the information provided by the class labels in the training set. We present action classification results on five different datasets. Our results are either comparable to, or significantly better than previous published results on these datasets.
}}

@article{anderson-chiu-11_temporal-dynamics-language-vision, title={On the temporal dynamics of language-mediated vision and vision-mediated language}, author={Anderson, S.E. and Chiu, E. and Huette, S. and Spivey, M.J.}, journal={Acta psychologica}, volume={137}, number={2}, pages={181--189}, year={2011}, publisher={Elsevier},
Recent converging evidence suggests that language and vision interact immediately in non-trivial ways, although the exact nature of this interaction is still unclear.
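[note on the wang-mori entry above: a minimal sketch (python/numpy) of the "bag-of-words" video representation, where each frame is quantized to its nearest codebook entry ("word") and the whole sequence becomes a normalized histogram over the codebook. the random descriptors and codebook here stand in for real motion/appearance features and a learned (e.g. k-means) codebook.]

import numpy as np

rng = np.random.default_rng(0)
codebook = rng.normal(size=(50, 16))    # 50 visual "words", each a 16-dim descriptor
frames = rng.normal(size=(120, 16))     # one video: 120 per-frame descriptors

def bag_of_words(frames, codebook):
    # assign each frame to its nearest codeword, then count occurrences
    dists = np.linalg.norm(frames[:, None, :] - codebook[None, :, :], axis=2)
    words = dists.argmin(axis=1)
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()

print(bag_of_words(frames, codebook).round(3))   # the "document" fed to the topic model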
Not only does linguistic information influence visual perception in real-time, but visual information also influences language comprehension in real-time. For example, in visual search tasks, incremental spoken delivery of the target features (e.g., “Is there a red vertical?”) can increase the efficiency of conjunction search because only one feature is heard at a time. Moreover, in spoken word recognition tasks, the visual presence of an object whose name is similar to the word being spoken (e.g., a candle present when instructed to “pick up the candy”) can alter the process of comprehension. Dense sampling methods, such as eye-tracking and reach-tracking, richly illustrate the nature of this interaction, providing a semi-continuous measure of the temporal dynamics of individual behavioral responses. We review a variety of studies that demonstrate how these methods are particularly promising in further elucidating the dynamic competition that takes place between underlying linguistic and visual representations in multimodal contexts, and we conclude with a discussion of the consequences that these findings have for theories of embodied cognition.
}}

@article{french-mareschal-11_connectionist-sequence-chunks, title={TRACX: a recognition-based connectionist framework for sequence segmentation and chunk extraction.}, author={French, R.M. and Addyman, C. and Mareschal, D.}, journal={Psychological review}, volume={118}, number={4}, pages={614}, year={2011},
annote = {
computational simulation of temporal sequence chunking

--abstract-- Individuals of all ages extract structure from the sequences of patterns they encounter in their environment, an ability that is at the very heart of cognition. Exactly what underlies this ability has been the subject of much debate over the years. A novel mechanism, implicit chunk recognition (ICR), is proposed for sequence segmentation and chunk extraction. The mechanism relies on the recognition of previously encountered subsequences (chunks) in the input rather than on the prediction of upcoming items in the input sequence. A connectionist autoassociator model of ICR, truncated recursive autoassociative chunk extractor (TRACX), is presented in which chunks are extracted by means of truncated recursion. The performance and robustness of the model is demonstrated in a series of 9 simulations of empirical data, covering a wide range of phenomena from the infant statistical learning and adult implicit learning literatures, as well as 2 simulations demonstrating the model’s ability to generalize to new input and to develop internal representations whose structure reflects that of the items in the input sequence. TRACX outperforms PARSER (Perruchet & Vinter, 1998) and the simple recurrent network (SRN, Cleeremans & McClelland, 1991) in matching human sequence segmentation on existing data. A new study is presented exploring 8-month-olds’ use of backward transitional probabilities to segment auditory sequences.
}}

==Language : Grammar learning==

@article{waterfall-sandbank-10_computational-language-acquisition, title={An empirical generative framework for computational modeling of language acquisition}, author={Waterfall, H.R. and Sandbank, B. and Onnis, L. and Edelman, S.}, journal={Journal of child language}, volume={37}, pages={671--703}, year={2010}, publisher={Cambridge Univ Press},
annote = {
CHILDES: corpus of utterances by children learning various languages, and also that of caregivers.
adopts an unsupervised grammar learning system called ConText to learn from this corpus, to see if a system can acquire a grammar for English in an unsupervised manner. ConText, a much simpler algorithm developed in response to ADIOS, operates directly on the distributional statistics of the corpus and characterizes words and phrases by the local linguistic contexts in which they appeared. In ConText, the distributional statistics of a word or a sequence of words (w) are determined by the surrounding words (i.e. local context). The width of this local context, L, is a user-specified parameter, set in most of our experiments to be two words on either side of w. To calculate the distributional statistics of w, ConText constructs its left and right context vectors.

[Distributional statistics] have been instrumental for the automatic acquisition of syntactic categories (Redington, Chater & Finch, 1998), the grouping of nouns into semantic categories (Pereira, Tishby & Lee, 1993), unsupervised parsing (Clark, 2001; Klein & Manning, 2002) and text classification (Baker & McCallum, 1998).

ConText forms word and phrase categories:
E15 → bowl | refrigerator | oven | house | mirror | country | corner | sky | basket | living room | kitchen | barn | bath tub | snow | closet | carriage | world | box | bag | bedroom | car | sink | air | water | movie | forest | sand | drawer
E32 → eat | drink
E68 → warm | hot | cold
E104 → hold on | listen here
E32 and E104 - both verbs - seem to also incorporate some semantic aspects.

abstract = { This paper reports progress in developing a computer model of language acquisition in the form of (1) a generative grammar that is (2) algorithmically learnable from realistic corpus data, (3) viable in its large-scale quantitative performance and (4) psychologically real. First, we describe new algorithmic methods for unsupervised learning of generative grammars from raw CHILDES data and give an account of the generative performance of the acquired grammars. Next, we summarize findings from recent longitudinal and experimental work that suggests how certain statistically prominent structural properties of child-directed speech may facilitate language acquisition. We then present a series of new analyses of CHILDES data indicating that the desired properties are indeed present in realistic child-directed speech corpora. Finally, we suggest how our computational results, behavioral findings, and corpus-based insights can be integrated into a next-generation model aimed at meeting the four requirements of our modeling framework.

the approaches to grammar acquisition that are of most interest to us are those that work in a completely unsupervised fashion on completely unannotated corpora – that is, algorithms that start with no explicit knowledge of potential structures and no data beyond the raw text or transcribed speech. Most existing algorithms for grammar induction have not been designed or tested for operation that is realistic in that sense (e.g. the highly successful algorithm of Klein and Manning (2002) learns structures from data annotated for part of speech information). A most notable exception in this respect is the Unsupervised Data-Oriented Parsing (U-DOP) algorithm developed by Bod (2009). The DOP approach uses the tree-substitution grammar formalism, representing the structure of a novel sentence in terms of probabilistically weighted structural analogies to trees gleaned from a training corpus.
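[note: a minimal sketch (python) of the ConText context-vector idea described above -- for every word, count which words occur within L positions to its left and to its right; words with similar left/right context profiles can then be grouped into one category (like the E15, E32, ... classes). toy corpus; not the actual ConText implementation.]

from collections import Counter, defaultdict

corpus = [
    "the cat sat in the box",
    "the dog sat in the basket",
    "the cat slept in the box",
]
L = 2   # context width: two words on either side, as in most of their experiments

left_ctx, right_ctx = defaultdict(Counter), defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for i, w in enumerate(words):
        left_ctx[w].update(words[max(0, i - L):i])
        right_ctx[w].update(words[i + 1:i + 1 + L])

# "box" and "basket" end up with near-identical left contexts ("in the"),
# the kind of distributional evidence that puts them into the same category
print(dict(left_ctx["box"]), dict(left_ctx["basket"]))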
In the unsupervised version, the trees are obtained by simply listing all the possible binary tree descriptions of sentences in the training corpus. As reported by Bod (2009), the U-DOP algorithm performs well in the task of learning a grammar from CHILDES data annotated with part of speech information, as assessed by comparing the structures it induces to those from a hand-annotated gold-standard syntactic parse of the corpus (its performance on raw CHILDES data is somewhat lower).
}}

@conference{borensztajn-09_neural-theory-grammar-acquisition, title={The hierarchical prediction network: towards a neural theory of grammar acquisition}, author={Borensztajn, G. and Zuidema, W. and Bod, R.}, booktitle={Proc. of the 31st Annual Meeting of the Cognitive Science Society}, year={2009},
annote = {
discussion 09:
neocortex: six-layered structure - vertical columns - replicated throughout
hawkins: memory-prediction framework: information is stored in hierarchical fashion; top levels are more invariant. input is processed in a bottom-up fashion, but expectation is top-down. topology among the cells --> lead to grammar.
input node layer: interacts with the world, e.g. each node is a word
compressor node: connected to one or two nodes below it
substitution space: represent the data in some n-dim virtual space
production: ordered slots --> fire --> attach to the virtual space or compressor nodes
Extends normal neural network structures by allowing a substitution operation between the nodes.

--abstract-- We develop an approach to automatically identify the most probable multi-word constructions used in children’s utterances, given syntactically annotated utterances from the Brown corpus of CHILDES. The found constructions cover many interesting linguistic phenomena from the language acquisition literature and show a progression from very concrete toward abstract constructions. We show quantitatively that for all children of the Brown corpus grammatical abstraction, defined as the relative number of variable slots in the productive units of their grammar, increases globally with age.
}}

@article{singh-reznick-12_infant-word-segmentation-longitudinal-8mo, title={Infant word segmentation and childhood vocabulary development: a longitudinal analysis}, author={Singh, L. and Steven Reznick, J. and Xuehua, L.}, journal={Developmental Science}, year={2012},
abstract = {
Infants begin to segment novel words from speech by 7.5 months, demonstrating an ability to track, encode and retrieve words in the context of larger units. Although it is presumed that word recognition at this stage is a prerequisite to constructing a vocabulary, the continuity between these stages of development has not yet been empirically demonstrated. ... Results [of 2 expts] demonstrated a strong degree of association between infant word segmentation abilities at 7 months and productive vocabulary size at 24 months. In addition, outcome groups, as defined by median vocabulary size and growth trajectories at 24 months, showed distinct word segmentation abilities as infants. These findings provide the first prospective evidence supporting the predictive validity of infant word segmentation tasks and suggest that they are indeed associated with mature word knowledge.
}}

@article{junge-Kooijman-12_rapid-word-recognition-at-10mo, title={Rapid recognition at 10 months as a predictor of language development}, author={Junge, C. and Kooijman, V. and Hagoort, P.
and Cutler, A.}, journal={Developmental Science}, year={2012},
Infants’ ability to recognize words in continuous speech is vital for building a vocabulary. We here examined the amount and type of exposure needed for 10-month-olds to recognize words. Infants first heard a word, either embedded within an utterance or in isolation, then recognition was assessed by comparing event-related potentials to this word versus a word that they had not heard directly before. Although all 10-month-olds showed recognition responses to words first heard in isolation, not all infants showed such responses to words they had first heard within an utterance. Those that did succeed in the latter, harder, task, however, understood more words and utterances when re-tested at 12 months, and understood more words and produced more words at 24 months, compared with those who had shown no such recognition response at 10 months. The ability to rapidly recognize the words in continuous utterances is clearly linked to future language development.
}}

@article{kaminski-schulz-12_how-dogs-know-when-addressed, title={How dogs know when communication is intended for them}, author={Kaminski, J. and Schulz, L. and Tomasello, M.}, journal={Developmental Science}, year={2012},
abstract = {
Domestic dogs comprehend human gestural communication in a way that other animal species do not. But little is known about the specific cues they use to determine when human communication is intended for them. In a series of four studies, we confronted both adult dogs and young dog puppies with object choice tasks in which a human indicated one of two opaque cups by either pointing to it or gazing at it. We varied whether the communicator made eye contact with the dog in association with the gesture (or whether her back was turned or her eyes were directed at another recipient) and whether the communicator called the dog’s name (or the name of another recipient). Results demonstrated the importance of eye contact in human–dog communication, and, to a lesser extent, the calling of the dog’s name – with no difference between adult dogs and young puppies – which are precisely the communicative cues used by human infants for identifying communicative intent. Unlike human children, however, dogs did not seem to comprehend the human’s communicative gesture when it was directed to another human, perhaps because dogs view all human communicative acts as directives for the recipient.
}}

@article{hochmann-etal-11_consonants-help-word-recog-vowels-structure-12mo, title={Consonants and vowels: different roles in early language acquisition}, author={Hochmann, J.R. and Benavides-Varela, S. and Nespor, M. and Mehler, J.}, journal={Developmental Science}, year={2011},
abstract = {
Language acquisition involves both acquiring a set of words (i.e. the lexicon) and learning the rules that combine them to form sentences (i.e. syntax). Here, we show that consonants are mainly involved in word processing, whereas vowels are favored for extracting and generalizing structural relations. We demonstrate that such a division of labor between consonants and vowels plays a role in language acquisition. In two very similar experimental paradigms, we show that 12-month-old infants rely more on the consonantal tier when identifying words (Experiment 1), but are better at extracting and generalizing repetition-based structures over the vocalic tier (Experiment 2).
These results indicate that infants are able to exploit the functional differences between consonants and vowels at an age when they start acquiring the lexicon, and suggest that basic speech categories are assigned to different learning mechanisms that sustain early language acquisition. Infants are able to use statistical information, such as dips in transition probabilities (TPs) between syllables, to identify word boundaries in a continuous speech stream (Saffran, Aslin & Newport, 1996).
}}

@article{butler-patterson-12_semantic-effects-on-past-tense-inflection, title={In search of meaning: Semantic effects on past-tense inflection}, author={Butler, R. and Patterson, K. and Woollams, A.M.}, year={2012}, journal={Quarterly Journal of Experimental Psychology}, volume={65}, number={8},
annote = {
Within single-mechanism connectionist models of inflectional morphology, generating the past-tense form of a verb depends upon the interaction of semantic and phonological representations, with semantic information being particularly important for irregular or exception verbs. We assessed this hypothesis in two experiments requiring normal speakers to produce the past tense from a verb stem that takes a regular or exceptional past tense. Experiment 1 revealed significant latency advantages for high- over low-imageability words for both regular verbs (e.g., “lunged” faster than “loved”) and exception items (e.g., “drank” faster than “dealt”); but critically, this effect was significantly larger for exceptions than for regulars. Experiment 2 employed a semantic priming paradigm where participants inflected verb stems (e.g., sit) preceded by related (e.g., chair) or unrelated primes (e.g., jug) and revealed a priming effect in accuracy that was confined to the exception items. Our results are consistent with predictions from single-mechanism connectionist models of inflectional morphology and converge with findings from neurological patients and studies of reading aloud.
}}

@inproceedings{kim-mooney-12_unsupervised-pcfg-induction-grounded-language, title={Unsupervised PCFG Induction for Grounded Language Learning with Highly Ambiguous Supervision}, author={Kim, J. and Mooney, R.J.}, booktitle={Proceedings of the Conference on Empirical Methods in Natural Language Processing and Natural Language Learning, EMNLP-CoNLL}, volume={12}, year={2012},
annote = {
“Grounded” language learning employs training data in the form of sentences paired with relevant but ambiguous perceptual contexts. Börschinger et al. (2011) introduced an approach to grounded language learning based on unsupervised PCFG induction. Their approach works well when each sentence potentially refers to one of a small set of possible meanings, such as in the sportscasting task. However, it does not scale to problems with a large set of potential meanings for each sentence, such as the navigation instruction following task studied by Chen and Mooney (2011). This paper presents an enhancement of the PCFG approach that scales to such problems with highly-ambiguous supervision. Experimental results on the navigation task demonstrate the effectiveness of our approach.
}}

==Language : Word learning / grounding==

@article{matuszek-zettlemoyer-12_language-perception-grounded-attribute-learning, title={A Joint Model of Language and Perception for Grounded Attribute Learning}, author={Cynthia Matuszek and FitzGerald, N. and Zettlemoyer, L. and Bo, L.
and Fox, D.}, journal={Arxiv preprint arXiv:1206.6423}, year={2012},
annote = {
Cynthia Matuszek: Learning Novel Attributes from Combined Language and Perception
learn cognition from observation: show some objects and have people describe them ("can you describe these objects to us"); try to parse the descriptions to obtain semantic grounding. mainly colour and shape.
mechanical turk data collection: have people describe the objects, with an incentive for minimizing description lengths; 15 people (8 male, 7 female).
"this is a green object" --> lambda x green(x)
orange ball - combination of colour and shape. doesn't know that "orange" is class colour - tries both.
input problems: someone says: "this is a fake lettuce, don't eat it"

--abstract-- As robots become more ubiquitous and capable of performing complex tasks, the importance of enabling untrained users to interact with them has increased. In response, unconstrained natural-language interaction with robots has emerged as a significant research area. We discuss the problem of parsing natural language commands to actions and control structures that can be readily implemented in a robot execution system. Our approach learns a parser based on example pairs of English commands and corresponding control language expressions. We evaluate this approach in the context of following route instructions through an indoor environment, and demonstrate that our system can learn to translate English commands into sequences of desired actions, while correctly capturing the semantic intent of statements involving complex control structures. The procedural nature of our formal representation allows a robot to interpret route instructions online while moving through a previously unknown environment.
}}

@article{muncer-knight-12_bigram-trough-syllable-effect-in-lexical-decision, title={The bigram trough hypothesis and the syllable number effect in lexical decision}, author={Muncer, S.J. and Knight, D.C.}, year={2012}, journal={Quarterly Journal of Experimental Psychology}, volume={65}, number={8},
annote = {
There has been an increasing volume of evidence supporting the role of the syllable in various word processing tasks. It has, however, been suggested that syllable effects may be caused by orthographic redundancy. In particular, it has been proposed that the presence of bigram troughs at syllable boundaries causes what are seen as syllable effects. We investigated the bigram trough hypothesis as an explanation of the number of syllables effect for lexical decision in five-letter words and nonwords from the British Lexicon Project. The number of syllables made a significant contribution to prediction of lexical decision times along with word frequency and orthographic similarity. The presence of a bigram trough did not. For nonwords, the number of syllables made a significant contribution to prediction of lexical decision times only for nonwords with relatively long decision times. The presence of a bigram trough made no contribution. The evidence presented suggests that the bigram trough cannot be an explanation of the syllable number effect in lexical decision. A comparison of the results from words and nonwords is interpreted as providing some support for dual-route models of reading.
}}

@inproceedings{chen-12_fast-lexicon-learning-for-grounded-language-acquisition, title={Fast online lexicon learning for grounded language acquisition}, author={Chen, D.L.}, booktitle={Proc.
of the Annual Meeting of the Association for Computational Linguistics (ACL)}, year={2012},
Abstract: Learning a semantic lexicon is often an important first step in building a system that learns to interpret the meaning of natural language. It is especially important in language grounding where the training data usually consist of language paired with an ambiguous perceptual context. Recent work by Chen and Mooney (2011) introduced a lexicon learning method that deals with ambiguous relational data by taking intersections of graphs. While the algorithm produced good lexicons for the task of learning to interpret navigation instructions, it only works in batch settings and does not scale well to large datasets. In this paper we introduce a new online algorithm that is an order of magnitude faster and surpasses the state-of-the-art results. We show that by changing the grammar of the formal meaning representation language and training on additional data collected from Amazon’s Mechanical Turk we can further improve the results. We also include experimental results on a Chinese translation of the training data to demonstrate the generality of our approach.
}}

@article{mather-plunkett12_role-of-novelty-in-early-word-learning, title={The role of novelty in early word learning}, author={Mather, E. and Plunkett, K.}, journal={Cognitive Science}, year={2012}, publisher={Wiley Online Library},
annote = {
22-month-old babies know that a new, unfamiliar word is more likely to be associated with a novel object.

abstract: What mechanism implements the mutual exclusivity bias to map novel labels to objects without names? Prominent theoretical accounts of mutual exclusivity (e.g., Markman, 1989, 1990) propose that infants are guided by their knowledge of object names. However, the mutual exclusivity constraint could be implemented via monitoring of object novelty (see Merriman, Marazita, & Jarvis, 1995). We sought to discriminate between these contrasting explanations across two preferential looking experiments with 22-month-olds. In Experiment 1, infants viewed three objects: one name-known, two name-unknown. Of the two name-unknown objects, one was novel, and the other had been previously familiarized. The infants responded to hearing a novel label by increasing attention only to the novel, name-unknown object. In a second experiment in which the name-known object was absent, a novel label increased infants’ attention to a novel object beyond baseline preference for novelty. The experiments provide clear evidence for a novelty-based mechanism. However, differences in the time course of disambiguation across experiments suggest that novelty processing may be influenced by contextual factors.
}}

@article{mani-mills-12_vowels-in-early-words, title={Vowels in early words: an event-related potential study}, author={Mani, N. and Mills, D.L. and Plunkett, K.}, journal={Developmental Science}, year={2012}, publisher={Wiley Online Library},
Abstract: Previous behavioural research suggests that infants possess phonologically detailed representations of the vowels and consonants in familiar words. These tasks examine infants’ sensitivity to mispronunciations of a target label in the presence of a target and distracter image. Sensitivity to the mispronunciation may, therefore, be contaminated by the degree of mismatch between the distracter label and the heard mispronounced label.
Event-related potential (ERP) studies allow investigation of infants’ sensitivity to the relationship between a heard label (correct or mispronounced) and the referent alone using single picture trials. ERPs also provide information about the timing of lexico-phonological activation in infant word recognition. The current study examined 14-month-olds’ sensitivity to vowel mispronunciations of familiar words using ERP data from single picture trials. Infants were presented with familiar images followed by a correct pronunciation of its label, a vowel mispronunciation or a phonologically unrelated non-word. The results support and extend previous behavioural findings that 14-month-olds are sensitive to mispronunciations of the vowels in familiar words using an ERP task. We suggest that the presence of pictorial context reinforces infants’ sensitivity to mispronunciations of words, and that mispronunciation sensitivity may rely on infants accessing the cross-modal associations between word forms and their meanings. }} @article{caza-knott-12_pragmatic-bootstrapping-neural-network-vocabulary-acquisition, title={Pragmatic bootstrapping: a neural network model of vocabulary acquisition}, author={Caza, G.A. and Knott, A.}, journal={Language Learning and Development}, volume={8}, number={2}, pages={113--135}, year={2012}, publisher={Taylor \& Francis}, annote = { learns from single word input: A final difference in our data representation is that it associates individual words with individual concepts rather than associating multiword utterances with groups of concepts (as, e.g., in Siskind, 1996; Yu & Ballard, 2007). Our streams differ in their granularity by isolating individual concepts and single-word utterances. For simplicity, we assume that the child pays special attention to certain emphasized words and these emphasized words are the ones that appear in our utterance stream. }} @article{greco-carrea-11_grounding-symbols-no-composition-without-discrimination, title={Grounding compositional symbols: no composition without discrimination}, author={Greco, A. and Carrea, E.}, journal={Cognitive Processing}, pages={1--12}, year={2011}, publisher={Springer}, annote = { The classical computational conception of meaning has been challenged by the idea that symbols must be grounded on sensorimotor processes. A difficult question arises from the fact that grounding representations cannot be symbolic themselves but, in order to support compositionality, should work as primitives. This implies that they should be precisely identifiable and strictly connected with discriminable perceptual features. Ideally, each representation should correspond to a single discriminable feature. The present study was aimed at exploring whether feature discrimination is a fundamental requisite for grounding compositional symbols. We studied this problem by using Integral stimuli, composed of two interacting and not separable features. Such stimuli were selected in Experiment 1 as pictures whose component features are easily or barely discriminable (Separable or Integral) on the basis of psychological distance metrics (City-block or Euclidean) computed from similarity judgments. In Experiment 2, either each feature was associated with one word of a two-word expression, or the whole stimulus with a single word. In Experiment 3, the procedure was reversed and words or expressions were associated with whole pictures or separate features. 
Results support the hypothesis that single words are best grounded by Integral stimuli and composite expressions by Separable stimuli, where a strict association of single words with discriminated features is possible. }} @article{battaglia-borensztajn-bod-12_structured-cognition-rats-to-language, title={Structured cognition and neural systems: From rats to language}, author={Battaglia, F.P. and Borensztajn, G. and Bod, R.}, journal={Neuroscience \& Biobehavioral Reviews}, year={2012}, publisher={Elsevier} annote = { very interesting ideas that suggest that learning grammars is a generalization of an ability to parse the input into hierarchies such as sub-events in an action, regions of a painting, or phrases in sentences. they suggest an approach based on learning all sub-trees of a parse rather than just the bottom level structure - called Data-Oriented parsing or DOP. [code available. ??project??] --abstract-- Much of animal and human cognition is compositional in nature: higher order, complex representations are formed by (rule-governed) combination of more primitive representations. We review here some of the evidence for compositionality in perception and memory, motivating an approach that takes ideas and techniques from computational linguistics to model aspects of structural representation in cognition. We summarize some recent developments in our work that, on the one hand, use algorithms from computational linguistics to model memory consolidation and the formation of semantic memory, and on the other hand use insights from the neurobiology of memory to develop a neurally inspired model of syntactic parsing that improves over existing (not cognitively motivated) models in computational linguistics. These two theoretical studies highlight interesting analogies between language acquisition, semantic memory and memory consolidation, and suggest possible neural mechanisms, implemented in computational algorithms that may underlie memory consolidation. }} ==Humour== @article{marinkovic-baldwin-11_right-hemisphere-joke-appreciation, title={Right hemisphere has the last laugh: neural dynamics of joke appreciation}, author={Marinkovic, K. and Baldwin, S. and Courtney, M.G. and Witzel, T. and Dale, A.M. and Halgren, E.}, journal={Cognitive, Affective, \& Behavioral Neuroscience}, volume={11}, number={1}, pages={113--130}, year={2011}, publisher={Springer} annote = { the neural processes in understanding humour... abstract: Understanding a joke relies on semantic, mnemonic, inferential, and emotional contributions from multiple brain areas. Anatomically constrained magnetoencephalography (aMEG) combining high-density whole-head MEG with anatomical magnetic resonance imaging allowed us to estimate where the humor-specific brain activations occur and to understand their temporal sequence. Punch lines provided either funny, not funny (semantically congruent), or nonsensical (incongruent) replies to joke questions. Healthy subjects rated them as being funny or not funny. As expected, incongruous endings evoke the largest N400m in left-dominant temporo-prefrontal areas, due to integration difficulty. In contrast, funny punch lines evoke the smallest N400m during this initial lexical–semantic stage, consistent with their primed “surface congruity” with the setup question. 
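[note on the battaglia-borensztajn-bod entry above: a minimal sketch (python) of what "learning all sub-trees of a parse" means in DOP -- every fragment of the tree, in which any non-terminal child may be left as an open substitution site, goes into the fragment bank. trees are written as (label, children...) tuples; the example tree is made up.]

from itertools import product

def nodes(tree):
    yield tree
    for child in tree[1:]:
        yield from nodes(child)

def fragments(node):
    """All DOP fragments rooted at this node."""
    label, children = node[0], node[1:]
    if not children:                       # a terminal word: the only fragment is itself
        return [(label,)]
    options = []
    for child in children:
        opts = [(child[0],)]               # leave the child as an open substitution site...
        if child[1:]:
            opts += fragments(child)       # ...or expand it further (non-terminals only)
        options.append(opts)
    return [(label, *choice) for choice in product(*options)]

tree = ("S", ("NP", ("D", ("the",)), ("N", ("dog",))), ("VP", ("V", ("barks",))))
bank = [f for n in nodes(tree) if n[1:] for f in fragments(n)]
print(len(bank), "fragments, e.g.", bank[-1])    # 24 fragments for this small tree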
In line with its sensitivity to ambiguity, the anteromedial prefrontal cortex may contribute to the subsequent “second take” processing, which, for jokes, presumably reflects detection of a clever “twist” contained in the funny punch lines. Joke-selective activity simultaneously emerges in the right prefrontal cortex, which may lead an extended bilateral temporo-frontal network in establishing the distant unexpected creative coherence between the punch line and the setup. This progression from an initially promising but misleading integration from left fronto-temporal associations, to medial prefrontal ambiguity evaluation and right prefrontal reprocessing, may reflect the essential tension and resolution underlying humor.
}}

==Tacit knowledge==

@article{bargh-schwader-12trics_automaticity-in-cognition, title={Automaticity in social-cognitive processes}, author={Bargh, J.A. and Schwader, K.L. and Hailey, S.E. and Dyer, R.L. and Boothby, E.J.}, journal={Trends in cognitive sciences}, year={2012},
annote = {
automaticity has emerged as a broad phenomenon over the past few years. the idea, proposed 30 years ago [1], that some social-perceptual processes (e.g. impression formation and stereotyping) may have efficient and unintentional components operating outside conscious awareness has now become a staple in explaining almost all psychological phenomena.

two classes of automaticity:
‘preconscious': generated from effortless sensory or perceptual activity and then serve as implicit, unappreciated inputs into conscious and deliberate processes. e.g. behavioral contagion or conformity effects triggered by the perception of others’ behavior and immediate impressions of others based on their facial features or expressions alone, also others driven by automatic sensory perception and the perception of internal states as in embodied cognition and emotional influences, including emotional influences on moral judgment. A major development over the past decade and especially the past 5 years has been the inclusion of motivational and goal pursuit processes into this category of preconsciously automatic processes. Research has shown that goal pursuits can become activated (primed) by relevant situational features; they then operate outside of conscious awareness and guidance.
‘goal-dependent’ or ‘postconscious’: consequences of prior conscious and intentional thought, such as unconscious components in consciously intended decision-making processes and those that support one's conscious commitment to a relationship partner. (see [1]; also [5]).

--from [1]-- [J.A. Bargh, Conditional automaticity: varieties of automatic influence on social perception and cognition, in J. Uleman & J.A. Bargh (Eds.), Unintended Thought, Guilford (1989), pp. 3–51]
Bargh p. 5: the thesis that a given cognitive process is either automatic or controlled is incorrect. This assumption results in faulty conclusions... an automatic process is taken to be
- unintentional,
- effortless,
- autonomous,
- involuntary,
- occurring outside conscious awareness.
and anything that meets one or two of these criteria is taken to be automatic. However, attention, awareness, intention and control do not necessarily occur together in an all-or-none fashion.

studies of automaticity in impression formation and social judgment have shown subjects engaging in task-relevant processing very efficiently, even when attentional resources are scarce. Because these routinized modes of thought are relatively independent of conscious attention, they are automatic or effortless.
But subjects are following explicit instructions to form an impression or make the judgment, so it is not unintentional. many processing effects that are unintentional may depend on conscious or attentional processing ... e.g. upon perception of the attitude object, trait categorization of behavioural information, and most category-priming demonstrations. Processes previously believed to be prototypic examples of automaticity -- a) activation of a word's meaning during reading; b) semantic priming and spreading activation; c) the Stroop color-word interference effect; d) well-practiced visual target detection -- have all been shown to require some attentional resources (and are not completely effortless). [Dark, Johnston, Myles-Worsley & Farah 1985]

--abstract-- Over the past several years, the concept of automaticity of higher cognitive processes has permeated nearly all domains of psychological research. In this review, we highlight insights arising from studies in decision-making, moral judgments, close relationships, emotional processes, face perception and social judgment, motivation and goal pursuit, conformity and behavioral contagion, embodied cognition, and the emergence of higher-level automatic processes in early childhood. Taken together, recent work in these domains demonstrates that automaticity does not result exclusively from a process of skill acquisition (in which a process always begins as a conscious and deliberate one, becoming capable of automatic operation only with frequent use) – there are evolved substrates and early childhood learning mechanisms involved as well.
}}

==Embodiment==

@article{pezzulo-barsalou-cangelosi-11_mechanics-of-embodiment-computational, title={The mechanics of embodiment: a dialog on embodiment and computational modeling}, author={Pezzulo, G. and Barsalou, L.W. and Cangelosi, A. and Fischer, M.H. and McRae, K. and Spivey, M.J.}, journal={Frontiers in psychology}, volume={2}, year={2011}, publisher={Frontiers Media SA},
Abstract: Embodied theories are increasingly challenging traditional views of cognition by arguing that conceptual representations that constitute our knowledge are grounded in sensory and motor experiences, and processed at this sensorimotor level, rather than being represented and processed abstractly in an amodal conceptual system. Given the established empirical foundation, and the relatively underspecified theories to date, many researchers are extremely interested in embodied cognition but are clamoring for more mechanistic implementations. What is needed at this stage is a push toward explicit computational models that implement sensorimotor grounding as intrinsic to cognitive processes. In this article, six authors from varying backgrounds and approaches address issues concerning the construction of embodied computational models, and illustrate what they view as the critical current and next steps toward mechanistic theories of embodiment. The first part has the form of a dialog between two fictional characters: Ernest, the “experimenter,” and Mary, the “computational modeler.” The dialog consists of an interactive sequence of questions, requests for clarification, challenges, and (tentative) answers, and touches the most important aspects of grounded theories that should inform computational modeling and, conversely, the impact that computational modeling could have on embodied theories. The second part of the article discusses the most important open challenges for embodied computational modeling.
}}

@article{maouene-smith-08_body-parts-and-early-verbs, title={Body Parts and Early-Learned Verbs}, author={Maouene, J. and Hidaka, S. and Smith, L.B.}, journal={Cognitive Science}, volume={32}, number={7}, pages={1200--1216}, year={2008}, publisher={Wiley Online Library},
annote = { early verbs are correlated with body parts. body maps - proportional to the brain areas devoted to them (homunculus maps). At 21 months, verbs involving actions of the mouth and lips are 47% of the “meanings” of all verbs known at this age. Growth in verb meanings from 22 to 24 months overwhelmingly (86% of all new meanings) concerns actions by the limbs. The predominant region of growth after this point is in verbs that specifically involve the hands, accounting for 58% of new meanings from 24 to 26 months and 59% of all new meanings from 26 to 30 months. At 30 months, verbs labelling actions involving hands and arms dominate all verb meanings, accounting for 51% of all verbs in children’s total productive vocabulary at 30 months. Together, these body maps provide a developmental picture of verb learning that is strongly organized by the body’s morphology. earlier version: [maouene-06icdl_body-parts-early-verb-acquisition] }}

==Gaze / Attention==

@inproceedings{kuriyama-tokunaga-11_gaze-matching-of-referring-expressions-in-collaborative-problem-solving, title={Gaze matching of referring expressions in collaborative problem solving}, author={Kuriyama, N. and Terai, A. and Yasuhara, M. and Tokunaga, T. and Yamagishi, K. and Kusumi, T.}, booktitle={Proceedings of International Workshop on Dual Eye Tracking in CSCW (DUET 2011)}, year={2011},
annote = { subjects A and B collaborate in tasks, e.g. A asks B: "Put the big triangle next to the square". Is B looking at the same part of the screen as A? Among pairs who manage to do the task better, this overlap is higher.
Abstract. Richardson and Dale (2005) showed that eye gaze matching between speakers and listeners contributed to language comprehension. While their study used a static image as a visual stimulus, and the speech and eye gaze of speakers and that of listeners were recorded serially, we recorded speech in synchronisation with eye gaze of both participants simultaneously in a collaborative problem solving setting. The analysis of the collected data revealed that the eye gaze matching rate is higher in successful pairs than in unsuccessful pairs, and the peak of the matching rate comes at a different position from the onset of referring expressions depending on the surface form of the expressions. }}

==Number sense / Math==

@article{siegler-fazio-12trics_fractions-numerical-development, title={Fractions: the new frontier for theories of numerical development}, author={Siegler, R.S. and Fazio, L.K. and Bailey, D.H. and Zhou, X.}, journal={Trends in Cognitive Sciences}, year={2012},
annote = { [January 2013, Vol. 17, No. 1, pp. 13-19] our sense of magnitude is located in an area of the brain (the intraparietal sulcus, IPS). fractions integrate a large degree of implicit, non-symbolic knowledge (which figure has a higher proportion of blue dots) with symbolic knowledge (is 2/3 greater than 3/4?). this compact survey paper also covers educational aspects of how fractions are learned and used.
--abstract-- Recent research on fractions has broadened and deepened theories of numerical development.
Learning about fractions requires children to recognize that many properties of whole numbers are not true of numbers in general and also to recognize that the one property that unites all real numbers is that they possess magnitudes that can be ordered on number lines. The difficulty of attaining this understanding makes the acquisition of knowledge about fractions an important issue educationally, as well as theoretically. This article examines the neural underpinnings of fraction understanding, developmental and individual differences in that understanding, and interventions that improve the understanding. Accurate representation of fraction magnitudes emerges as crucial both to conceptual understanding of fractions and to fraction arithmetic. }}

@article{libertus-feigenson-11_approximate-number-sense-predicts-math-ability-3yo, title={Preschool acuity of the approximate number system correlates with school math ability}, author={Libertus, M.E. and Feigenson, L. and Halberda, J.}, journal={Developmental Science}, year={2011},
annote = { children's rapid responses to images with questions such as "are there more yellow dots than blue?" are measured. There is a wide spread - tests done on 85 children aged 3-5 indicate variability in accuracy, with s.d. (sigma) of about 15-20%. Response time and Weber fraction were also measured. A statistical correlation is found between this numerical ability and early math test scores on the Test of Early Math Ability (TEMA-3). As reaction times increase, there is a gentle downward slope in the TEMA scores. However, I couldn't understand fig 2a, where the slope is downward as accuracy increases as well. This seems to contradict the claim in the text that "faster RT and greater accuracy on the ANS acuity task are associated with higher math ability." },
abstract = { Previous research shows a correlation between individual differences in people’s school math abilities and the accuracy with which they rapidly and nonverbally approximate how many items are in a scene. This finding is surprising because the Approximate Number System (ANS) underlying numerical estimation is shared with infants and with non-human animals who never acquire formal mathematics. However, it remains unclear whether the link between individual differences in math ability and the ANS depends on formal mathematics instruction. Earlier studies demonstrating this link tested participants only after they had received many years of mathematics education, or assessed participants’ ANS acuity using tasks that required additional symbolic or arithmetic processing similar to that required in standardized math tests. To ask whether the ANS and math ability are linked early in life, we measured the ANS acuity of 200 3- to 5-year-old children using a task that did not also require symbol use or arithmetic calculation. We also measured children’s math ability and vocabulary size prior to the onset of formal math instruction. We found that children’s ANS acuity correlated with their math ability, even when age and verbal skills were controlled for. These findings provide evidence for a relationship between the primitive sense of number and math ability starting early in life. }}

@article{agrillo-piffer-12_musicians-better-at-magnitude-estimation, title={Musicians outperform non-musicians in magnitude estimation: evidence of a common processing mechanism for time, space and numbers}, author={Agrillo, C.
and Piffer, L.}, year={2012}, journal={Quarterly Journal of Experimental Psychology}, volume={65}, issue={8},
annote = { It has been proposed that time, space, and numbers may be computed by a common magnitude system. Even though several behavioural and neuroanatomical studies have focused on this topic, the debate is still open. To date, nobody has used the individual differences for one of these domains to investigate the existence of a shared cognitive system. Musicians are known to outperform nonmusicians in temporal discrimination tasks. We therefore observed professional musicians and nonmusicians undertaking three different tasks: temporal (participants were required to estimate which of two tones lasted longer), spatial (which line was longer), and numerical discrimination (which group of dots was more numerous). If time, space, and numbers are processed by the same mechanism, it is expected that musicians will have a greater ability, even in nontemporal dimensions. As expected, musicians were more accurate with regard to temporal discrimination. They also gave better performances in both the spatial and the numerical tasks, but only outside the subitizing range. Our data are in accordance with the existence of a common magnitude system. We suggest, however, that this mechanism may not involve the whole numerical range.
SUBITIZE: able to estimate (a small) number without having to count }}

==Belief / Categorization==

@article{baillargeon_10trics_false-belief-understanding, title={False-belief understanding in infants}, author={Baillargeon, R. and Scott, R.M. and He, Z.}, journal={Trends in Cognitive Sciences}, volume={14}, number={3}, pages={110--118}, issn={1364-6613}, year={2010},
annote = { In a classic paper from 1983, Wimmer and Perner introduced the false-belief task: a toy is hidden in a green box in front of child A and agent B. Then, when agent B has left the room, the toy is shifted to a yellow box. Agent B re-enters the room, and child A is now asked: which box will agent B search for the toy in? All the 3–4-year-olds, and about half the 4–6-year-olds, answer "the yellow box" (the new location). By age 9, the idea that agent B has a false belief becomes available to most subjects. Thus, children's ideas about belief in other agents are different from adults'. This aspect has led to a vast literature. Here, the idea is to explore this false belief not through explicit questions, but through infants' looking patterns, as tested in a violation-of-expectation (VoE) task. Based on this, the authors surmise that even 15-month-olds have some awareness that agent B may believe the item to be in the green box.
--abstract-- At what age can children attribute false beliefs to others? Traditionally, investigations into this question have used elicited-response tasks in which children are asked a direct question about an agent’s false belief. Results from these tasks indicate that the ability to attribute false beliefs does not emerge until about age 4. However, recent investigations using spontaneous-response tasks suggest that this ability is present much earlier. Here we review results from various spontaneous-response tasks that suggest that infants in the second year of life can already attribute false beliefs about location and identity as well as false perceptions. We also consider alternative interpretations that have been offered for these results, and discuss why elicited-response tasks are particularly difficult for young children.
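side note: the logic of the task itself is easy to make explicit. A minimal sketch (my own illustration, not the authors' model, and not a model of infant looking behaviour; all names are made up): the observing agent's belief about the toy's location is updated only by moves it actually witnesses, so after an unseen move the belief diverges from the world state, and the "where will the agent search?" question reads out the belief rather than reality.

# Minimal sketch of the Wimmer & Perner style false-belief task (illustrative only).
class World:
    def __init__(self, toy_location):
        self.toy_location = toy_location

class Agent:
    def __init__(self, world):
        self.belief = world.toy_location    # formed while watching the hiding

    def observe_move(self, world):
        self.belief = world.toy_location    # only called if the agent witnesses the move

    def where_will_i_search(self):
        return self.belief                  # search follows belief, not reality

world = World("green box")
agent_B = Agent(world)                  # B watches the toy being hidden in the green box
world.toy_location = "yellow box"       # toy is moved while B is out of the room
# agent_B.observe_move(world) is NOT called, so B's belief is now false

print(agent_B.where_will_i_search())   # 'green box'  (B's false belief)
print(world.toy_location)              # 'yellow box' (where the toy really is)

Young children who answer "yellow box" are, in effect, reporting world.toy_location instead of agent_B.belief.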
}}

@article{ell-ashby-12_unsupervised-category-w-feature-fusion, title={Unsupervised category learning with integral-dimension stimuli}, author={Ell, S.W. and Ashby, F.G. and Hutchinson, S.}, year={2012}, journal={Quarterly Journal of Experimental Psychology}, volume={65}, issue={8}, pages={1537--1562},
annote = { How do we form categories? What features are used, which are ignored?
--abstract-- Despite the recent surge in research on unsupervised category learning, relatively little research has focused on constrained tasks in which the goal is to learn predefined stimulus clusters [as opposed to unconstrained tasks] in the absence of feedback. The few studies that have addressed this issue have focused almost exclusively on stimuli for which it is relatively easy to attend selectively to the component dimensions (i.e., separable dimensions). In the present study, we investigated the ability of participants to learn categories constructed from stimuli for which it is difficult, if not impossible, to attend selectively to the component dimensions (i.e., integral dimensions). The experiments demonstrate that individuals are capable of learning categories constructed from the integral dimensions of brightness and saturation, but this ability is generally limited to category structures requiring selective attention to brightness. As might be expected with integral dimensions, participants were often able to integrate brightness and saturation information in the absence of feedback — an ability not observed in previous studies with separable dimensions. Even so, there was a bias to weight brightness more heavily than saturation in the categorization process, suggesting a weak form of selective attention to brightness. These data present an important challenge for the development of models of unsupervised category learning. }}

==Spatial Cognition==

@article{avraamides-galati-denis-12_spatial-info-updating-from-narratives, doi={10.1080/17470218.2012.712147}, author={Marios N. Avraamides and Alexia Galati and Francesca Pazzagli and Chiara Meneghetti and Michel Denis}, title={Encoding and updating spatial information presented in narratives}, journal={Quarterly Journal of Experimental Psychology},
annote = { subjects read a description that placed them in a square space (e.g. a hotel lobby), and were told about objects placed at the corners and centers around them (e.g. [in the hotel lobby] the swimming pool could be seen in the front, the reception to the left, the elevators to the right, and the lobby entrance at the back). Details are described (e.g., “the painting depicts a scene from the ancient Greek mythology with the 12 gods from mount Olympus. You stare at the painting for a while thinking that its colours do not match well with those of the courtroom”). Then the subjects read about the protagonist turning to face these objects. Finally, they were asked to turn themselves, and to interpret where objects would be with respect to the protagonist. In one experiment, they are asked to turn in the direction the character rotated, and in another, in the opposite direction, but there was no effect of their embodied pose. The initial encoding is what most subjects used.
abstract: Four experiments investigated whether directional spatial relations encoded by reading narratives are updated following described protagonist rotations. Participants memorized locations of objects described in short stories that placed them, as the protagonist, in remote settings.
After reading a description that the protagonist rotated to the left or the right of the initial orientation, participants made judgements about object relations in the described environment (Experiment 1). Before making these judgments, participants were instructed to physically rotate to match (Experiment 2) or mismatch (Experiment 4) the protagonist's described rotation, and in Experiments 3 and 4 to also visualize the changed relations following rotation. Participants' performance suggested that they relied on the initial representation they constructed during encoding rather than on the updated protagonist-to-object relations. Participants' physical movement to match the described rotation and additional visualization instructions did not facilitate updating through a sensorimotor process. In these respects, updating spatial relations in situation models constructed from narratives differs from updating in perceptually experienced environments. }}

@inproceedings{matuszek-herbst-zettlemoyer-12_parsing-commands-to-robot, title={Learning to parse natural language commands to a robot control system}, author={Matuszek, C. and Herbst, E. and Zettlemoyer, L. and Fox, D.}, booktitle={Proc. of the 13th Int’l Symposium on Experimental Robotics (ISER)}, year={2012},
abstract = { As robots become more ubiquitous and capable of performing complex tasks, the importance of enabling untrained users to interact with them has increased. In response, unconstrained natural-language interaction with robots has emerged as a significant research area. We discuss the problem of parsing natural language commands to actions and control structures that can be readily implemented in a robot execution system. Our approach learns a parser based on example pairs of English commands and corresponding control language expressions. We evaluate this approach in the context of following route instructions through an indoor environment, and demonstrate that our system can learn to translate English commands into sequences of desired actions, while correctly capturing the semantic intent of statements involving complex control structures. The procedural nature of our formal representation allows a robot to interpret route instructions online while moving through a previously unknown environment. }}

==Neuroscience==

@article{kravitz-saleem-12trics_ventral-visual-pathway-object-recog, title={The ventral visual pathway: an expanded neural framework for the processing of object quality}, author={Kravitz, D.J. and Saleem, K.S. and Baker, C.I. and Ungerleider, L.G. and Mishkin, M.}, journal={Trends in Cognitive Sciences}, year={2012}, publisher={Elsevier},
annote = { [January 2013, Vol. 17, No. 1, pp. 26-49] Since the original characterization of the ventral visual pathway, our knowledge of its neuroanatomy, functional properties, and extrinsic targets has grown considerably. Here we synthesize this recent evidence and propose that the ventral pathway is best understood as a recurrent occipito-temporal network containing neural representations of object quality both utilized and constrained by at least six distinct cortical and subcortical systems. Each system serves its own specialized behavioral, cognitive, or affective function, collectively providing the raison d’être for the ventral visual pathway.
This expanded framework contrasts with the depiction of the ventral visual pathway as a largely serial staged hierarchy culminating in singular object representations and more parsimoniously incorporates attentional, contextual, and feedback effects.
Fig. 2b: At least six distinct pathways emanate from the occipitotemporal network:
1. the occipitotemporo-neostriatal pathway (black lines) originates from every region in the network and supports visually-dependent habit formation and skill learning;
2. a projection targeting the ventral striatum (or nucleus accumbens) supports the assignment of stimulus valence;
3. the occipitotemporo-amygdaloid pathway supports the processing of emotional stimuli;
4. the occipitotemporo-medial temporal pathway targets the perirhinal and entorhinal cortices as well as the hippocampus and supports long-term object and object-context memory;
5. the occipitotemporo-orbitofrontal pathway: reward processing;
6. the occipitotemporo-ventrolateral prefrontal pathway: object working memory. }}

@article{caggiano-fogassi-rizzolatti-11_view-based-action-recog-motor-neurons, title={View-based encoding of actions in mirror neurons of area F5 in macaque premotor cortex}, author={Caggiano, V. and Fogassi, L. and Rizzolatti, G. and Pomper, J.K. and Thier, P. and Giese, M.A. and Casile, A.}, journal={Current Biology}, year={2011},
annote = { [neuroscience models of action recognition, based on the intriguing discovery of "mirror neurons". These neurons fire when an individual performs the action itself, but also when it sees the action being done by others. This paper proposes that mirror neurons are view-sensitive - they work only for a given view.] }}

@article{mcnealy-mazziotta-11_neural-language-learning, title={Age and experience shape developmental changes in the neural basis of language-related learning}, author={McNealy, K. and Mazziotta, J.C. and Dapretto, M.}, journal={Developmental Science}, year={2011},
annote = { ... neural underpinnings of language learning
abstract: One hundred and fifty-six participants, ranging from age 5 to adulthood, underwent functional magnetic resonance imaging (fMRI) while listening to three novel streams of continuous speech, which contained either strong statistical regularities, strong statistical regularities and speech cues, or weak statistical regularities providing minimal cues to word boundaries. Only the 5- to 10-year-old children displayed significant signal increases for the stream with low statistical regularities, suggesting an age-related decrease in sensitivity to more subtle statistical cues. Further, in a sample of 78 10-year-olds, we examined the impact of proficiency in a second language and level of pubertal development on learning-related signal increases, showing that the brain regions involved in language learning are influenced by both experiential and maturational factors. }}

@article{deshmukh-knierim-11_representation-spatial-entorhinal, title={Representation of non-spatial and spatial information in the lateral entorhinal cortex}, author={Deshmukh, S.S. and Knierim, J.J.}, journal={Frontiers in behavioral neuroscience}, volume={5}, year={2011}, publisher={Frontiers Media SA},
annote = { the role of the hippocampus in memory formation - it integrates the what, when and where. e.g.
at the dhaba on the Lucknow highway [WHERE] I saw two gunmen for protection [WHAT] the day before yesterday [WHEN].
place cell: responds to a specific location [O'Keefe 78]: hippocampus as cognitive map. The hippocampus is involved in episodic memory in humans, and possibly in animals. It is also involved in spatial location, which led to the notion that the hippocampus provides a spatial framework to organize memory.
tetrode-based experiments on individual neurons in the rat brain: a cable runs back to the recording hyperdrive; each screw carries a tetrode which records at 4 points and listens to a set of neurons, so the outputs of single neurons can be disambiguated [similar to triangulation] within a ~70 micron range [there may be 1K neurons in this range, but only 30 or so are seen - the rest may be silent]. [bsbe 12oct TALKS.t]
--Abstract-- Some theories of memory propose that the hippocampus integrates the individual items and events of experience within a contextual or spatial framework. The hippocampus receives cortical input from two major pathways: the medial entorhinal cortex (MEC) and the lateral entorhinal cortex (LEC). During exploration in an open field, the firing fields of MEC grid cells form a periodically repeating, triangular array. In contrast, LEC neurons show little spatial selectivity, and it has been proposed that the LEC may provide non-spatial input to the hippocampus. Here, we recorded MEC and LEC neurons while rats explored an open field that contained discrete objects. LEC cells fired selectively at locations relative to the objects, whereas MEC cells were weakly influenced by the objects. These results provide the first direct demonstration of a double dissociation between LEC and MEC inputs to the hippocampus under conditions of exploration typically used to study hippocampal place cells. }}

==Cognition and evolution==

@article{csibra-gergely-11_natural-pedagogy-as-evolution, title={Natural pedagogy as evolutionary adaptation}, author={Csibra, G. and Gergely, G.}, journal={Philosophical Transactions of the Royal Society B: Biological Sciences}, volume={366}, number={1567}, pages={1149--1157}, year={2011},
annote = { very similar to [csibra-gergely-09_natural-pedagogy], which is a slightly lighter treatment. particularly see Hoppitt et al. 08, lessons from animal teaching
abstract: We propose that the cognitive mechanisms that enable the transmission of cultural knowledge by communication between individuals constitute a system of ‘natural pedagogy’ in humans, and represent an evolutionary adaptation along the hominin lineage. We discuss three kinds of arguments that support this hypothesis. First, natural pedagogy is likely to be human-specific: while social learning and communication are both widespread in non-human animals, we know of no example of social learning by communication in any other species apart from humans. Second, natural pedagogy is universal: despite the huge variability in child-rearing practices, all human cultures rely on communication to transmit to novices a variety of different types of cultural knowledge, including information about artefact kinds, conventional behaviours, arbitrary referential symbols, cognitively opaque skills and know-how embedded in means-end actions. Third, the data available on early hominin technological culture are more compatible with the assumption that natural pedagogy was an independently selected adaptive cognitive system than considering it as a by-product of some other human-specific adaptation, such as language.
By providing a qualitatively new type of social learning mechanism, natural pedagogy is not only the product but also one of the sources of the rich cultural heritage of our species.
see also: Hoppitt 08: Lessons from animal teaching }}

@article{csibra-gergely-09_natural-pedagogy, title={Natural pedagogy}, author={Csibra, G. and Gergely, G.}, journal={Trends in cognitive sciences}, volume={13}, number={4}, pages={148--153}, year={2009},
abstract = { We propose that the cognitive mechanisms that enable the transmission of cultural knowledge by communication between individuals constitute a system of ‘natural pedagogy’ in humans, and represent an evolutionary adaptation along the hominin lineage. We discuss three kinds of arguments that support this hypothesis. First, natural pedagogy is likely to be human-specific: while social learning and communication are both widespread in non-human animals, we know of no example of social learning by communication in any other species apart from humans. Second, natural pedagogy is universal: despite the huge variability in child-rearing practices, all human cultures rely on communication to transmit to novices a variety of different types of cultural knowledge, including information about artefact kinds, conventional behaviours, arbitrary referential symbols, cognitively opaque skills and know-how embedded in means-end actions. Third, the data available on early hominin technological culture are more compatible with the assumption that natural pedagogy was an independently selected adaptive cognitive system than considering it as a by-product of some other human-specific adaptation, such as language. By providing a qualitatively new type of social learning mechanism, natural pedagogy is not only the product but also one of the sources of the rich cultural heritage of our species. }}