Selected publications by year:
2024
2023
- Elizabeth Nielsen, Christo Kirov and Brian Roark. 2023. Distinguishing Romanized Hindi from Romanized Urdu. In Proceedings of the Workshop on Computation and Written Language (CAWL 2023), pp. 33–42.
- Elizabeth Nielsen, Christo Kirov and Brian Roark. 2023. Spelling convention sensitivity in neural language models. In Findings of EACL, pp. 1304–1316.
- Sebastian Ruder, Jonathan H. Clark, Alexander Gutkin, Mihir Kale, Min Ma, Massimo Nicosia, Shruti Rijhwani, Parker Riley, Jean-Michel Sarr, Xinyi Wang, John Wieting, Nitish Gupta, Anna Katanova, Christo Kirov, Dana L. Dickinson, Brian Roark, Bidisha Samanta, Connie Tao, David I. Adelani, Vera Axelrod, Isaac Caswell, Colin Cherry, Dan Garrette, Reeve Ingle, Melvin Johnson, Dmitry Panteleev and Partha Talukdar. 2023. XTREME-UP: A User-Centric Scarce-Data Benchmark for Under-Represented Languages. In Findings of EMNLP, pp. 1856-1884.
2022
- Işın Demirşahin, Cibu Johny, Alexander Gutkin and Brian Roark. 2022. Criteria for Useful Automatic Romanization in South Asian Languages. In Proceedings of LREC, pp. 6662–6673.
- Raiomond Doctor, Alexander Gutkin, Cibu Johny, Brian Roark and Richard Sproat. 2022. Graphemic Normalization of the Perso-Arabic Script. In Proceedings of Grapholinguistics in the 21st Century, pp. 315-375.
- Alexander Gutkin, Cibu Johny, Raiomond Doctor, Brian Roark and Richard Sproat. 2022. Beyond Arabic: Software for Perso-Arabic Script Manipulation. In Proceedings of the Arabic Natural Language Processing Workshop (WANLP), pp. 381–387.
- Alexander Gutkin, Cibu Johny, Raiomond Doctor, Lawrence Wolf-Sonkin and Brian Roark. 2022. Extensions to Brahmic script processing within the Nisaba library: new scripts, languages and utilities. In Proceedings of LREC, pp. 6450–6460.
- Brian Roark and Alexander Gutkin. 2022. Design principles of an open-source language modeling microservice package for AAC text-entry applications. In Proceedings of the ACL Workshop on Speech and Language Processing for Assistive Technologies (SLPAT), pp. 1–16.
2021
- Kyle Gorman, Christo Kirov, Brian Roark and Richard Sproat. 2021. Structured abbreviation expansion in context. In Findings of EMNLP, pp. 995–1005.
- Cibu Johny, Lawrence Wolf-Sonkin, Alexander Gutkin and Brian Roark. 2021. Finite-state script normalization and processing utilities: The Nisaba Brahmic library. In Proceedings of EACL Demo Session, pp. 14–23.
- Tiago Pimentel, Ryan Cotterell and Brian Roark. 2021. Disambiguatory Signals are Stronger in Word-initial Positions. In Proceedings of EACL, pp. 31–41.
- Tiago Pimentel, Brian Roark, Søren Wichmann, Ryan Cotterell and Damian Blasi. 2021. Finding Concept-specific Biases in Form–Meaning Associations. In Proceedings of NAACL, pp. 4416–4425.
- Ananda Theertha Suresh, Brian Roark, Michael Riley and Vlad Schogol. 2021. Approximating probabilistic models as weighted finite automata. Computational Linguistics, 47(2):221–254.
2020
- Brian Roark, Lawrence Wolf-Sonkin, Christo Kirov, Sabrina J. Mielke, Cibu Johny, Işın Demirşahin and Keith Hall. 2020. Processing South Asian languages written in the Latin script: The Dakshina dataset. In Proceedings of LREC, pp. 2413-2423.
- Arindrima Datta, Bhuvana Ramabhadran, Jesse Emond, Anjuli Kannan and Brian Roark. 2020. Language-agnostic Multilingual Modeling. In Proceedings of ICASSP.
- Tiago Pimentel, Brian Roark and Ryan Cotterell. 2020. Phonotactic Complexity and Its Trade-offs. Transactions of the ACL (TACL), 8:1-18.
2019
- Ananda Theertha Suresh, Brian Roark, Michael Riley, and Vlad Schogol. 2019. Distilling weighted finite automata from arbitrary probabilistic models. In Proceedings of FSMNLP, pp. 87-97.
- Lawrence Wolf-Sonkin, Vlad Schogol, Brian Roark, and Michael Riley. 2019. Latin script keyboards for South Asian languages with finite-state normalization. In Proceedings of FSMNLP, pp. 108-117.
- Tiago Pimentel, Arya D. McCarthy, Damian Blasi, Brian Roark and Ryan Cotterell. 2019. Meaning to Form: Measuring Systematicity as Information. In Proceedings of ACL, pp. 1751–1764.
- Sabrina J. Mielke, Ryan Cotterell, Kyle Gorman, Brian Roark and Jason Eisner. 2019. What Kind of Language Is Hard to Language-Model? In Proceedings of ACL, pp. 4975–4989.
- Hao Zhang, Richard Sproat, Axel H. Ng, Felix Stahlberg, Xiaochang Peng, Kyle Gorman and Brian Roark. 2019. Neural Models of Text Normalization for Speech Applications. Computational Linguistics, 45(2):293-337.
2018
2017
2016
- Cyril Allauzen, Michael Riley and Brian Roark. 2016. Distributed representation and estimation of WFST-based n-gram models. In ACL Workshop on statistical NLP and weighted automata (StatFSM), pp. 32-41.
- Yoni Halpern, Keith Hall, Vlad Schogol, Michael Riley, Brian Roark, Gleb Skobeltsyn and Martin Baeuml. 2016. Contextual prediction models for speech recognition. In Proceedings of Interspeech, pp. 2338-2342.
- Vitaly Kuznetsov, Hank Liao, Mehryar Mohri, Michael Riley and Brian Roark. 2016. Learning n-gram language models from uncertain data. In Proceedings of Interspeech, pp. 2323-2327.
- Umut Orhan, Hooman Nezamfar, Murat Akcakaya, Deniz Erdogmus, Matt Higger, Mohammad Moghadamfalahi, Andrew Fowler, Brian Roark, Barry Oken and Melanie Fried-Oken. 2016. Probabilistic simulation framework for EEG-based BCI design. In Brain-Computer Interfaces, 3(4):171-185.
2015
- Petar Aleksic, Mohammadreza Ghodsi, Assaf Michaely, Cyril Allauzen, Keith Hall, Brian Roark, David Rybach and Pedro Moreno. 2015. Bringing Contextual Information to Google Speech Recognition. In Proceedings of Interspeech, pp. 468-472.
- Keith Hall, Eunjoon Cho, Cyril Allauzen, Francoise Beaufays, Noah Coccaro, Kaisuke Nakajima, Michael Riley, Brian Roark, David Rybach and Linda Zhang. 2015. Composition-based on-the-fly rescoring for salient n-gram biasing. In Proceedings of Interspeech, pp. 1418-1422.
- Emily Prud’hommeaux and Brian Roark. 2015. Graph-based word alignment for clinical language evaluation. In Computational Linguistics, 41(4):549-578.
- Brian Roark, Melanie Fried-Oken and Chris Gibbons. 2015. Huffman and Linear Scanning Methods with Statistical Language Models. In Augmentative and Alternative Communication, 31(1):37-50.
2014
- Fadi Biadsy, Keith Hall, Pedro Moreno and Brian Roark. 2014. Backoff Inspired Features for Maximum Entropy Language Models. In Proceedings of Interspeech.
- Eric Morley, Anna Eva Hallin and Brian Roark. 2014. Data Driven Grammatical Error Detection in Transcripts of Children's Speech. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 980-989.
- Eric Morley, Anna Eva Hallin and Brian Roark. 2014. Challenges in Automating Maze Detection. In Proceedings of the ACL Workshop on Computational Linguistics and Clinical Psychology, pp. 69-77.
- Barry S. Oken, Umut Orhan, Brian Roark, Deniz Erdogmus, Andrew Fowler, Aimee Mooney, Betts Peters, Meghan Miller, and Melanie B. Fried-Oken. 2014. Brain-computer interface with language model-Electroencephalography fusion for locked-in syndrome. Neurorehabilitation and Neural Repair, 28(4):387-394.
- Emily Prud’hommeaux, Eric Morley, Masoud Rouhizadeh, Laura Silverman, Jan van Santen, Brian Roark, Richard Sproat, Sarah Kauper and Rachel DeLaHunta. 2014. Computational Analysis of the Trajectories of Linguistic Development in Autism. In Proceedings of the IEEE Spoken Language Technology Workshop (SLT).
- Brian Roark and Richard Sproat. 2014. Hippocratic Abbreviation Expansion. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (ACL), pp. 364-369.
- Richard Sproat, Mahsa Yarmohammadi, Izhak Shafran and Brian Roark. 2014. Applications of Lexicographic Semirings to Problems in Speech and Language Processing. Computational Linguistics, 40(4):733–761.
- Ke Wu, Cyril Allauzen, Keith Hall, Michael Riley and Brian Roark. 2014. Encoding Linear Models As Weighted Finite-State Transducers. In Proceedings of Interspeech.
- Mahsa Yarmohammadi, Aaron Dunlop and Brian Roark. 2014. Transforming trees into hedges and parsing with "hedgebank" grammars. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (ACL), pp. 797-802.
2013
- Russ Beckley and Brian Roark. 2013. Pair Language Models for Deriving Alternative Pronunciations and Spellings from Pronunciation Dictionaries. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1584-1589.
- Erinç Dikici, Emily Prud'hommeaux, Brian Roark and Murat Saraçlar. 2013. Investigation of MT-based ASR Confusion Models for Semi-Supervised Discriminative Language Modeling. In Proceedings of Interspeech.
- Andrew Fowler, Brian Roark, Umut Orhan, Deniz Erdogmus and Melanie Fried-Oken. 2013. Improved inference and autotyping in EEG-based BCI typing systems. In Proceedings of the 15th ACM SIGACCESS International Conference on Computers and Accessibility (ASSETS).
- Maider Lehr, Izhak Shafran, Emily Prud’hommeaux and Brian Roark. 2013. Discriminative Joint Modeling of Lexical Variation and Acoustic Confusion for Automated Narrative Retelling Assessment. In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT), pp. 211-220.
- Eric Morley, Brian Roark and Jan van Santen. 2013. The Utility of Manual and Automatic Linguistic Error Codes for Identifying Neurodevelopmental Disorders. In Proceedings of the NAACL-HLT 2013 8th Workshop on Innovative Use of NLP for Building Educational Applications (BEA8), pp. 1-10.
- Umut Orhan, Deniz Erdogmus, Brian Roark, Barry S. Oken and Melanie Fried-Oken. 2013. Offline Analysis of Context Contribution to ERP-based Typing BCI Performance. Journal of Neural Engineering, 10(6):066003.
- Brian Roark, Cyril Allauzen and Michael Riley. 2013. Smoothed marginal distribution constraints for language modeling. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (ACL), pp. 43-52.
- Brian Roark, Russ Beckley, Chris Gibbons and Melanie Fried-Oken. 2013. Huffman scanning: using language models within fixed-grid keyboard emulation. Computer Speech and Language, 27(6):1212-1234.
- Masoud Rouhizadeh, Emily Prud’hommeaux, Brian Roark and Jan van Santen. 2013. Distributional semantic models for the evaluation of disordered speech. In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT), pp. 709-714.
2012
- Ebru Arisoy, Murat Saraçlar, Brian Roark and Izhak Shafran. 2012. Discriminative Language Modeling with Linguistic and Statistically Derived Features. IEEE Transactions on Audio, Speech and Language Processing, 20(2):540-550.
- Steven Bedrick, Russell Beckley, Brian Roark and Richard Sproat. 2012. Robust kaomoji detection in Twitter. In Proceedings of the Second Workshop on Language in Social Media, pp. 56-64.
- Arda Çelebi, Hasim Sak, Erinç Dikici, Murat Saraçlar, Maider Lehr, Emily T. Prud'hommeaux, Puyang Xu, Nathan Glenn, Damianos Karakos, Sanjeev Khudanpur, Brian Roark, Kenji Sagae, Izhak Shafran, Daniel Bikel, Chris Callison-Burch, Yuan Cao, Keith Hall, Eva Hasler, Philipp Koehn, Adam Lopez, Matt Post, Darcey Riley. 2012. Semi-supervised discriminative language modeling for Turkish ASR. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 5025-5028.
- Jeffrey Higginbotham, Bryan Moulton, Greg Lesher and Brian Roark. 2012. The Application of Natural Language Processing to Augmentative and Alternative Communication. Assistive Technology, 24(1):14-24.
- Maider Lehr, Emily T. Prud’hommeaux, Izhak Shafran and Brian Roark. 2012. Fully Automated Neuropsychological Assessment for Detecting Mild Cognitive Impairment. In Proceedings of Interspeech.
- Damianos Karakos, Brian Roark, Izhak Shafran, Kenji Sagae, Maider Lehr, Emily T. Prud'hommeaux, Puyang Xu, Nathan Glenn, Sanjeev Khudanpur, Murat Saraçlar, Daniel Bikel, Mark Dredze, Chris Callison-Burch, Yuan Cao, Keith Hall, Eva Hasler, Philipp Koehn, Adam Lopez, Matt Post and Darcey Riley. 2012. Deriving conversation-based features from unlabeled speech for discriminative language modeling. In Proceedings of Interspeech.
- Umut Orhan, Deniz Erdogmus, Brian Roark, Barry Oken, Shalini Purwar, Kenneth E. Hild II, Andrew Fowler and Melanie Fried-Oken. 2012. Improved Accuracy Using Recursive Bayesian Estimation Based Language Model Fusion in ERP-Based BCI Typing Systems. In Proceedings of the 34th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC’12).
- Umut Orhan, Kenneth E. Hild II, Deniz Erdogmus, Brian Roark, Barry Oken, Melanie Fried-Oken. 2012. RSVP Keyboard: an EEG based typing interface. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 645-648.
- Emily Prud'hommeaux and Brian Roark. 2012. Graph-based alignment of narratives for automated neurological assessment. In Proceedings of the 2012 Workshop on Biomedical Natural Language Processing (BioNLP), pp. 1-10.
- Brian Roark, Kristy Hollingshead and Nathan Bodenstab. 2012. Finite-state chart constraints for reduced complexity context-free parsing pipelines. Computational Linguistics, 38(4):719–753.
- Brian Roark, Richard Sproat, Cyril Allauzen, Michael Riley, Jeffrey Sorensen and Terry Tai. 2012. The OpenGrm open-source finite-state grammar software libraries. In Proceedings of the ACL 2012 System Demonstrations, pp. 61-66.
- Kenji Sagae, Maider Lehr, Emily T. Prud'hommeaux, Puyang Xu, Nathan Glenn, Damianos Karakos, Sanjeev Khudanpur, Brian Roark, Murat Saraçlar, Izhak Shafran, Daniel Bikel, Chris Callison-Burch, Yuan Cao, Keith Hall, Eva Hasler, Philipp Koehn, Adam Lopez, Matt Post and Darcey Riley. 2012. Hallucinated n-best lists for discriminative language modeling. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 5001-5004.
- Puyang Xu, Sanjeev Khudanpur, Maider Lehr, Emily T. Prud'hommeaux, Nathan Glenn, Damianos Karakos, Brian Roark, Kenji Sagae, Murat Saraçlar, Izhak Shafran, Daniel Bikel, Chris Callison-Burch, Yuan Cao, Keith Hall, Eva Hasler, Philipp Koehn, Adam Lopez, Matt Post and Darcey Riley. 2012. Continuous space discriminative language modeling. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 2129-2132.
- Puyang Xu, Sanjeev Khudanpur and Brian Roark. 2012. Phrasal Cohort Based Unsupervised Discriminative Language Modeling. In Proceedings of Interspeech.
2011
- Russell Beckley and Brian Roark. 2011. Asynchronous fixed-grid scanning with dynamic codes. In Proceedings of the 2nd Workshop on Speech and Language Processing for Assistive Technologies (SLPAT).
- Nathan Bodenstab, Aaron Dunlop, Keith Hall and Brian Roark. 2011. Beam-Width Prediction for Efficient CYK Parsing. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics (ACL), pp. 440-449.
- Nathan Bodenstab, Kristy Hollingshead and Brian Roark. 2011. Unary Constraints for Context-Free Parsing. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics (ACL), Short papers, pp. 676-681.
- Aaron Dunlop, Nathan Bodenstab and Brian Roark. 2011. Efficient matrix-encoded grammars and low latency parallelization strategies for CYK. In Proceedings of the 12th International Conference on Parsing Technologies (IWPT), pp. 163-174.
- Kenneth Hild, Umut Orhan, Deniz Erdogmus, Brian Roark, Barry Oken, Shalini Purwar, Hooman Nezamfar and Melanie Fried-Oken. 2011. An ERP-based Brain-Computer Interface for text entry using Rapid Serial Visual Presentation and Language Modeling. In Proceedings of the ACL-HLT 2011 System Demonstrations, pp. 38-43.
- Zhifei Li, Ziyuan Wang, Jason Eisner, Sanjeev Khudanpur and Brian Roark. 2011. Minimum Imputed-Risk: Unsupervised Discriminative Training for Machine Translation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 920-929.
- Margaret Mitchell, Aaron Dunlop and Brian Roark. 2011. Semi-supervised Modeling for Prenominal Modifier Ordering. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics (ACL), Short papers, pp. 236-241.
- Umut Orhan, Deniz Erdogmus, Brian Roark, Shalini Purwar, Kenneth Hild II, Barry Oken, Hooman Nezamfar, Melanie Fried-Oken. 2011. Fusion with Language Models Improves Spelling Accuracy for ERP-based Brain Computer Interface Spellers. In 33rd Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC’11).
- Emily T. Prud’hommeaux, Brian Roark, Jan van Santen and Lois Black. 2011. Classification of atypical language in autism. In Proceedings of the ACL 2011 Workshop on Cognitive Modeling and Computational Linguistics, pp. 88-96.
- Emily T. Prud’hommeaux and Brian Roark. 2011. Extraction of narrative recall patterns for neuropsychological assessment. In Proceedings of Interspeech, pp. 3021-3024.
- Emily T. Prud'hommeaux and Brian Roark. 2011. Alignment of spoken narratives for automated neuropsychological assessment. In Proceedings of the IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
- Emily T. Prud’hommeaux, Margaret Mitchell and Brian Roark. 2011. Using Patterns of Narrative Recall for Improved Detection of Mild Cognitive Impairment. In Conference of the Rehabilitation Engineering and Assistive Technology Society of North America (RESNA) and 3rd International Conference on Technology and Aging (ICTA).
- Brian Roark, Margaret Mitchell, John-Paul Hosom, Kristy Hollingshead and Jeffrey A. Kaye. 2011. Spoken language derived measures for detecting Mild Cognitive Impairment. IEEE Transactions on Audio, Speech and Language Processing, 19(7):2081-2090. Also available on PubMed.
- Brian Roark, Richard Sproat and Izhak Shafran. 2011. Lexicographic Semirings for Exact Automata Encoding of Sequence Models. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics (ACL), Short papers, pp. 1-5.
- Brian Roark, Andrew Fowler, Richard Sproat, Chris Gibbons and Melanie Fried-Oken. 2011. Towards technology-assisted co-construction with communication partners. In Proceedings of the 2nd Workshop on Speech and Language Processing for Assistive Technologies (SLPAT).
- Brian Roark. 2011. Expected surprisal and entropy. Technical Report #CSLU-11-004, Center for Spoken Language Understanding, Oregon Health & Science University.
- Izhak Shafran, Richard Sproat, Mahsa Yarmohammadi and Brian Roark. 2011. Efficient Determinization of Tagged Word Lattices using Categorial and Lexicographic Semirings. In Proceedings of the IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
- Final report, Confusion-based Statistical Language Modeling for Machine Translation and Speech Recognition, JHU CLSP summer workshop project, 2011.
2010
- Ebru Arisoy, Murat Saraçlar, Brian Roark and Izhak Shafran. 2010. Syntactic and sub-lexical features for Turkish discriminative language models. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP).
- Aaron Dunlop, Margaret Mitchell and Brian Roark. 2010. Prenominal Modifier Ordering via Multiple Sequence Alignment. In Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics (HLT-NAACL), pp. 600-608.
- Christian Monson, Kristy Hollingshead, and Brian Roark. 2010. Simulating Morphological Analyzers with Stochastic Taggers for Confidence Estimation. In Multilingual Information Access Evaluation, Vol. I, 10th Workshop of the Cross-Language Evaluation Forum, CLEF 2009, Corfu, Greece, Revised Selected Papers. Lecture Notes in Computer Science, Springer.
- Brian Roark, Jacques de Villiers, Christopher Gibbons and Melanie Fried-Oken. 2010. Scanning methods and language modeling for binary switch typing. In Proceedings of the NAACL-HLT Workshop on Speech and Language Processing for Assistive Technologies (SLPAT), pp. 28-36.
- Tzvetan Tchoukalov, Christian Monson, and Brian Roark. 2010. Morphological Analysis by Multiple Sequence Alignment. In Multilingual Information Access Evaluation, Vol. I, 10th Workshop of the Cross-Language Evaluation Forum, CLEF 2009, Corfu, Greece, Revised Selected Papers. Lecture Notes in Computer Science, Springer.
- Chris Whelan, Brian Roark and Kemal Sonmez. 2010. Designing Antimicrobial Peptides with Weighted Finite State Transducers. In 32nd Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC'10), Buenos Aires.
2009
2008
2007
- Dimitra Vergyri, Izhak Shafran, Andreas Stolcke, Ramana R. Gadde, Murat Akbacak, Brian Roark, and Wen Wang. 2007. The SRI/OGI 2006 Spoken Term Detection System. In Proceedings of Interspeech, Antwerp, Belgium, August.
- Brian Roark, Margaret Mitchell and Kristy Hollingshead. 2007. Syntactic complexity measures for detecting Mild Cognitive Impairment. In Proceedings of the ACL 2007 Workshop on Biomedical Natural Language Processing (BioNLP), pp. 1-8.
- Seeger Fisher and Brian Roark. 2007. The utility of parse-derived features for automatic discourse segmentation. In Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics (ACL), pp. 488-495.
- Kristy Hollingshead and Brian Roark. 2007. Pipeline Iteration. In Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics (ACL), pp. 952-959.
- Brian Roark, John-Paul Hosom, Margaret Mitchell and Jeffrey A. Kaye. 2007. Automatically derived spoken language markers for detecting Mild Cognitive Impairment. In Proceedings of the 2nd International Conference on Technology and Aging (ICTA).
- Brian Roark, Murat Saraçlar and Michael Collins. 2007. Discriminative n-gram language modeling. Computer Speech and Language, 21(2):373-392.
- Brian Roark and Richard Sproat. 2007. Computational Approaches to Morphology and Syntax. Oxford University Press, Oxford.
- Brian Roark. 2007. Structural Alignment for Finite-State Syntactic Processing. Technical Report #CSLU-07-001, Center for Spoken Language Understanding, Oregon Health & Science University.
2006
- Michiel Bacchiani, Michael Riley, Brian Roark and Richard Sproat. 2006. MAP Adaptation of Stochastic Grammars. Computer Speech and Language, 20(1):41-68.
- John Hale, Izhak Shafran, Lisa Yung, Bonnie Dorr, Mary Harper, Anna Krasnyanskaya, Matthew Lease, Yang Liu, Brian Roark, Matthew Snover and Robin Stewart. 2006. PCFGs with Syntactic and Prosodic Indicators of Speech Repairs. In Proceedings of COLING and Annual Meeting of the Association for Computational Linguistics (COLING-ACL), pp. 161-168.
- Mehryar Mohri and Brian Roark. 2006. Probabilistic Context-Free Grammar Induction Based on Structural Zeros. In Proceedings of the Human Language Technology Conference and Meeting of the North American Chapter of the Association for Computational Linguistics (HLT-NAACL), pp. 312-319.
- Brian Roark, Mary Harper, Eugene Charniak, Bonnie Dorr, Mark Johnson, Jeremy Kahn, Yang Liu, Mari Ostendorf, John Hale, Anna Krasnyanskaya, Matthew Lease, Izhak Shafran, Matthew Snover, Robin Stewart and Lisa Yung. 2006. SParseval: Evaluation Metrics for Parsing Speech. In Proceedings of the Language Resources and Evaluation Conference (LREC), Genoa, Italy.
- Brian Roark, Yang Liu, Mary Harper, Robin Stewart, Matthew Lease, Matthew Snover, Izhak Shafran, Bonnie Dorr, John Hale, Anna Krasnyanskaya and Lisa Yung. 2006. Reranking for Sentence Boundary Detection in Conversational Speech. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Toulouse, France.
- Murat Saraçlar and Brian Roark. 2006. Utterance Classification with Discriminative Language Modeling. Speech Communication, 48(3-4):276-287.
2005
- Cyril Allauzen, Mehryar Mohri and Brian Roark. 2005. The Design Principles and Algorithms of a Weighted Grammar Library. International Journal of Foundations of Computer Science, 16(3):403-421.
- Cyril Allauzen, Mehryar Mohri and Brian Roark. 2005. A General Weighted Grammar Library. In Proceedings of the Ninth International Conference on Implementation and Application of Automata (CIAA), volume 3317 of Lecture Notes in Computer Science, Springer-Verlag, pages 23-34.
- Michael Collins, Murat Saraçlar and Brian Roark. 2005. Discriminative Syntactic Language Modeling for Speech Recognition. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL), pages 507-514.
- Kristy Hollingshead, Seeger Fisher and Brian Roark. 2005. Comparing and Combining Finite-State and Context-Free Parsers. In Proceedings of the Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing (HLT-EMNLP), pages 787-794.
- Mehryar Mohri and Brian Roark. 2005. Structural Zeros versus Sampling Zeros. Technical Report #CSE-05-003, Computer Science & Electrical Engineering, Oregon Health & Science University.
- Murat Saraçlar and Brian Roark. 2005. Joint Discriminative Language Modeling and Utterance Classification. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 561-564.
2004
- Cyril Allauzen, Mehryar Mohri, Michael Riley and Brian Roark. 2004. A Generalized Construction of Speech Recognition Transducers. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 761-764.
- Michiel Bacchiani and Brian Roark. 2004. Meta-data conditional language modeling. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 241-244.
- Michiel Bacchiani, Brian Roark and Murat Saraçlar. 2004. Language model adaptation with MAP estimation and the perceptron algorithm. In Proceedings of the Human Language Technology Conference and Meeting of the North American Chapter of the Association for Computational Linguistics (HLT-NAACL), pages 21-24.
- Michael Collins and Brian Roark. 2004. Incremental Parsing with the Perceptron Algorithm. In Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL), pages 111-118.
- Sameer R. Maskey, Michiel Bacchiani, Brian Roark and Richard Sproat. 2004. Improved name recognition with meta-data dependent name networks. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 789-792.
- Brian Roark. 2004. Robust garden path parsing. Natural Language Engineering, 10(1), pages 1-24.
- Brian Roark, Murat Saraçlar and Michael Collins. 2004. Corrective language modeling for large vocabulary ASR with the perceptron algorithm. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 749-752.
- Brian Roark, Murat Saraçlar, Michael Collins and Mark Johnson. 2004. Discriminative language modeling with conditional random fields and the perceptron algorithm. In Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL), pages 47-54.
2001-2003
- Cyril Allauzen, Mehryar Mohri, and Brian Roark. 2003. Generalized algorithms for constructing language models. In Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics (ACL), pages 40-47.
- Michiel Bacchiani and Brian Roark. 2003. Unsupervised language model adaptation. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 224-227.
- Michael Riley, Brian Roark and Richard Sproat. 2003. Good-Turing estimation from word lattices for unsupervised language model adaptation. In Proceedings of the IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), pages 453-458.
- Brian Roark and Michiel Bacchiani. 2003. Supervised and unsupervised PCFG adaptation to novel domains. In Proceedings of the Human Language Technology Conference and Meeting of the North American Chapter of the Association for Computational Linguistics (HLT-NAACL).
- Brian Roark. 2002. Markov Parsing: Lattice Rescoring with a Statistical Parser. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL), pages 287-294.
- Brian Roark. 2001. Robust Probabilistic Predictive Syntactic Processing: Motivations, Models, and Applications. Ph.D. Thesis, Department of Cognitive and Linguistic Sciences, Brown University.
- Brian Roark. 2001. Probabilistic top-down parsing and language modeling. In Computational Linguistics, 27(2), pages 249-276.
- Brian Roark. 2001. Explaining vowel inventory tendencies via simulation: finding a role for quantal locations and formant normalization. In Proceedings of the 31st Conference of the North East Linguistics Society (NELS 31).
Earlier