"Most Cited": Language Resources and Evaluation

The 200 most cited articles

1. Annotating expressions of opinions and emotions in language

2. The WaCky wide web: A collection of very large linguistically processed web-crawled corpora

3. IEMOCAP: Interactive emotional dyadic motion capture database

4. Unleashing the killer corpus: Experiences in creating the multi-everything AMI Meeting Corpus

5. How variable may a constant be? Measures of lexical richness in perspective

6. The MUMIN coding scheme for the annotation of feedback, turn management and sequencing phenomena

7. A large-scale classification of English verbs

8. Authorship attribution in the wild

9. A multidimensional approach for detecting irony in Twitter

10. I don't believe in word senses

11. Cross-language plagiarism detection

12. FactBank: A corpus annotated with event factuality

13. Lexical association measures and collocation extraction

14. Computer-based authorship attribution without lexical measures

15. Intrinsic plagiarism analysis

16. The English lexical substitution task

17. Temporal and event information in natural language text

18. The TempEval challenge: Identifying temporal relations in text

19. The challenge of optical music recognition

20. Framework and results for English SENSEVAL

21. Neural network applications in stylometry: The Federalist Papers

22. Multilingual and cross-domain temporal tagging

23. Interchanging lexical resources on the Semantic Web

24. A semantic network of English: The mother of all WordNets

25. A web-based Bengali news corpus for named entity recognition

26. Developing a corpus of plagiarised short answers

27. Language resources for Hebrew

28. An annotation scheme for conversational gestures: How to economically capture timing and form

29. The CHIL audiovisual corpus for lecture and meeting analysis inside smart rooms

30. Comparative evaluation of text classification techniques using a large diverse Arabic dataset

31. Perspectives on crowdsourcing annotations for natural language processing

32. Microblog language identification: Overcoming the limitations of short, unedited and idiomatic text

33. Lightweight methods to estimate influenza rates and alcohol sales volume from Twitter messages

34. The TORGO database of acoustic and articulatory speech from speakers with dysarthria

35. AnCora-CO: Coreferentially annotated corpora for Spanish and Catalan

36. The NXT-format Switchboard Corpus: A rich resource for investigating the syntax, semantics, pragmatics and prosody of dialogue

37. Classification of semantic relations between nominals

38. The NITE XML toolkit: Data model and query language

39. A multimodal annotated corpus of consensus decision making meetings

40. The ACL anthology network corpus

41. A multilingual ontology for infectious disease surveillance: Rationale, design and challenges

42. Introduction to EuroWordNet

43. Virtual agent multimodal mimicry of humans

44. Unsupervised morphological parsing of Bengali

45. Automatic keyphrase extraction from scientific articles

46. Creating a live, public short message service corpus: The NUS SMS corpus

47. Creating a system for lexical substitutions from scratch using crowdsourcing

48. Thesaurus or logical ontology, which one do we need for text mining?

49. The Bible as a parallel corpus: Annotating the "book of 2000 tongues"

50. WordNet then and now

51. MULTEXT-East: Morphosyntactic resources for Central and Eastern European languages

52. Corpus-based generation of head and eyebrow motion for an embodied conversational agent

53. Guidelines for word alignment evaluation and manual alignment

54. On the evaluation and improvement of Arabic WordNet coverage and usability

55. Lessons from building a Persian written corpus: Peykare

56. Japanese/English cross-language information retrieval: Exploration of query translation and transliteration

57. SpatialML: Annotation scheme, resources, and evaluation

58. Compositionality and lexical alignment of multi-word terms

59. TimeBank evolution as a community resource for TimeML parsing

60. The chicken-and-egg problem in wordnet design: Synonymy, synsets and constitutive relations

61. The Corpus DIMEx100: Transcription and evaluation

62. Temporal closure in an annotation environment

63. Hierarchical decision lists for word sense disambiguation

64. Balanced corpus of contemporary written Japanese

65. DanNet: The challenge of compiling a wordnet for Danish by reusing a monolingual dictionary

66. Getting to the heart of the matter: Speech as the expression of affect; Rather than just text or language

67. Do word meanings exist?

68. The top-down strategy for building EuroWordNet: Vocabulary coverage, base concepts and top ontology

69. The state of authorship attribution studies: Some problems and solutions

70. ECO and Onto.PT: A flexible approach for creating a Portuguese wordnet automatically

71. IPLR: An online resource for Greek word-level and sublexical information

72. Improving English verb sense disambiguation performance with linguistically motivated features and clear sense distinction boundaries

73. Evaluation of machine learning-based information extraction algorithms: Criticisms and recommendations

74. A novel approach for ranking spelling error corrections for Urdu

75. And then there were none: Winnowing the Shakespeare claimants

76. GATE Teamware: A web-based, collaborative text annotation framework

77. Glissando: A corpus for multidisciplinary prosodic studies in Spanish and Catalan

78. Alcohol language corpus: The first public corpus of alcoholized German speech

79. Annotating expressions of Appraisal in English

80. A survey of methods to ease the development of highly multilingual text mining applications

81. The Hamburg Metaphor Database project: Issues in resource creation

82. A corpus for studying addressing behaviour in multi-party dialogues

83. Stephen Crane and the New-York Tribune: A case study in traditional and non-traditional authorship attribution

84. SALDO: A touch of yin to WordNet's yang

85. Coupling an annotated corpus and a lexicon for state-of-the-art POS tagging

86. FrameNet, current collaborations and future goals

87. Challenges for a multilingual wordnet

88. Data and models for metonymy resolution

89. Multilingual resources for NLP in the lexical markup framework (LMF)

90. Combining linguistic resources to create a machine-tractable Japanese-Malay dictionary

91. The analysis of embodied communicative feedback in multimodal corpora: A prerequisite for behavior simulation

92. Yule's characteristic K revisited

93. Archaeological data models and web publication using XML

94. HamleDT: Harmonized multi-language dependency treebank

95. Supervised collaboration for syntactic annotation of Quranic Arabic

96. Is singular value decomposition useful for word similarity extraction?

97. Alignment-based extraction of multiword expressions

98. Multilingual collocation extraction with a syntactic parser

99. Copy detection in Chinese documents using Ferret

100. Cross-lingual sense determination: Can it work?

101. Access to pictorial material: A review of current research and future prospects

102. Spontaneous speech and opinion detection: Mining call-centre transcripts

103. Phonetically rich and balanced text and speech corpora for Arabic language

104. Methodology and construction of the Basque WordNet

105. Multiword expressions: Hard going or plain sailing?

106. Product named entity recognition in Chinese text

107. Dimensionality of dialogue act tagsets: An empirical analysis of large corpora

108. Reader-based exploration of lexical cohesion

109. Fact distribution in Information Extraction

110. Gore galore: Literary theory and computer games

111. The linguistic design of the EuroWordNet database

112. An Estonian morphological analyser and the impact of a corpus on its development

113. An overview of the European Union’s highly multilingual parallel corpora

114. Evaluating word sense induction and disambiguation methods

115. WHAD: Wikipedia historical attributes data: Historical structured data extraction and vandalism detection from the Wikipedia edit history

116. Classifying unlabeled short texts using a fuzzy declarative approach

117. A real time Named Entity Recognition system for Arabic text mining

118. Resources for Turkish morphological processing

119. Annotation of multiword expressions in the Prague dependency treebank

120. Valence extraction using EM selection and co-occurrence matrices

121. Normalization of Chinese chat language

122. LTAG-spinal and the Treebank: A new resource for incremental, dependency and semantic parsing

123. The GOLD Community of Practice: An infrastructure for linguistic data on the Web

124. The Linguistic Annotation Framework: a standard for annotation interchange and merging

125. Fine-grained Dutch named entity recognition

126. InSight Interaction: a multimodal and multifocal dialogue corpus

127. Evaluating and automating the annotation of a learner corpus

128. Analyzing the capabilities of crowdsourcing services for text summarization

129. Is there a language of sentiment? An analysis of lexical resources for sentiment analysis

130. EmoTales: Creating a corpus of folk tales with emotional annotations

131. Constructing and utilizing wordnets using statistical methods

132. Question answering at the cross-language evaluation forum 2003-2010

133. Overcoming statistical machine translation limitations: Error analysis and proposed solutions for the Catalan-Spanish language pair

134. Compilation of an idiom example database for supervised idiom identification

135. Multilingual language resources and interoperability

136. Automatic building of an ontology on the basis of text corpora in Thai

137. From the field to the web: Implementing best-practice recommendations in documentary linguistics

138. Tagging Icelandic text: An experiment with integrations and combinations of taggers

139. Adaptation of an automotive dialogue system to users' expertise and evaluation of the system

140. Applying EuroWordNet to cross-language text retrieval

141. Traditional and emotional stylometric analysis of the songs of Beatles Paul McCartney and John Lennon

142. A qualitative comparison method for rhetorical structures: identifying different discourse structures in multilingual corpora

143. Building the essential resources for Finnish: the Turku Dependency Treebank

144. Is it possible to create a very large wordnet in 100 days? An evaluation

145. Large, huge or gigantic? Identifying and encoding intensity relations among adjectives in WordNet

146. The Spanish DELPH-IN grammar

147. Semi-automatic enrichment of crowdsourced synonymy networks: The WISIGOTH system applied to Wiktionary

148. The MATCH corpus: A corpus of older and younger users' interactions with spoken dialogue systems

149. Irony in a judicial debate: Analyzing the subtleties of irony while testing the subtleties of an annotation scheme

150. Automatic induction of language model data for a spoken dialogue system

151. Can we talk? Methods for evaluation and training of spoken dialogue systems

152. Digital facsimiles: Reading the William Blake archive

153. Peeling an onion: The lexicographer's experience of manual sense-tagging

154. Computers and resource-based history teaching: A UK perspective

155. Using the right tools: Enhancing retrieval from marked-up documents

156. CityU corpus of essay drafts of English language learners: a corpus of textual revision in second language writing

157. The good, the bad and the implicit: a comprehensive approach to annotating explicit and implicit sentiment

158. The Chinese Discourse TreeBank: a Chinese corpus annotated with discourse relations

159. Bucking the trend: Improved evaluation and annotation practices for ESL error detection systems

160. The Romanian wordnet in a nutshell

161. Twitter n-gram corpus with demographic metadata

162. Collective intelligence and language resources: Introduction to the special issue on collaboratively constructed language resources

163. Multiplicity and word sense: Evaluating and learning from multiply labeled word sense annotations

164. Annotation of sentence structure: Capturing the relationship between clauses in Czech sentences

165. Collecting and evaluating speech recognition corpora for 11 South African languages

166. Statistical unicodification of African languages

167. DuELME: A Dutch electronic lexicon of multiword expressions

168. WOZ acoustic data collection for interactive TV

169. Exploring interoperability of language resources: The case of cross-lingual semi-automatic enrichment of wordnets

170. Lexical systems: Graph models of natural language lexicons

171. The Hinoki syntactic and semantic treebank of Japanese (Language Resources and Evaluation DOI: 10.1007/s10579-007-9036-6)

172. The importance of gaze and gesture in interactive multimodal explanation

173. Urdu in a parallel grammar development environment

174. Annotating discourse markers in spontaneous speech corpora on an example for the Slovenian language

175. Automatically learning semantic knowledge about multiword predicates

176. The Hinoki syntactic and semantic treebank of Japanese

177. Complex predicates in Indian languages and wordnets

178. Automatically generating related queries in Japanese

179. Detecting Japanese idioms with a linguistically rich dictionary

180. How to measure the meanings of words? amour in Corneille's work

181. The role of inference in the temporal annotation and analysis of text

182. Some of my best friends are linguists

183. Statistical morphological disambiguation for agglutinative languages

184. Pattern processing in melodic sequences: Challenges, caveats and prospects

185. Wag the dog? Online conferencing and teaching

186. Senseval: The CL research experience

187. Discovering Buffalo story robes: A case for cross-domain information strategies

188. Cross-linguistic alignment of WordNets with an inter-lingual-index

189. Developing a successful SemEval task in sentiment analysis of Twitter and other social media texts

190. A massively parallel corpus: the Bible in 100 languages

191. Multimodal corpus of multiparty conversations in L1 and L2 languages and findings obtained from it

192. Automatic dialogue act recognition with syntactic features

193. Text simplification resources for Spanish

194. Capturing divergence in dependency trees to improve syntactic projection

195. TypeCraft collaborative databasing and resource sharing for linguists

196. Introduction to the special issue: On wordnets and relations

197. Tailoring the automated construction of large-scale taxonomies using the web

198. Beyond sentence-level semantic role labeling: Linking argument structures in discourse

199. Coreference resolution: An empirical study based on SemEval-2010 shared Task 1

200. An open diachronic corpus of historical Spanish


Saturday, July 15, 2017
