The 200 most cited articles
1. Annotating expressions of opinions and emotions in language
2. The WaCky wide web: A collection of very large linguistically processed web-crawled corpora
3. IEMOCAP: Interactive emotional dyadic motion capture database
4. Unleashing the killer corpus: Experiences in creating the multi-everything AMI Meeting Corpus
5. How variable may a constant be? Measures of lexical richness in perspective
6. The MUMIN coding scheme for the annotation of feedback, turn management and sequencing phenomena
7. A large-scale classification of English verbs
8. Authorship attribution in the wild
9. A multidimensional approach for detecting irony in Twitter
10. I don't believe in word senses
11. Cross-language plagiarism detection
12. Factbank: A corpus annotated with event factuality
13. Lexical association measures and collocation extraction
14. Computer-based authorship attribution without lexical measures
15. Intrinsic plagiarism analysis
16. The English lexical substitution task
17. Temporal and event information in natural language text
18. The TempEval challenge: Identifying temporal relations in text
19. The challenge of optical music recognition
20. Framework and results for English SENSEVAL
21. Neural network applications in stylometry: The Federalist Papers
22. Multilingual and cross-domain temporal tagging
23. Interchanging lexical resources on the Semantic Web
24. A semantic network of English: The mother of all WordNets
25. A web-based Bengali news corpus for named entity recognition
26. Developing a corpus of plagiarised short answers
27. Language resources for Hebrew
28. An annotation scheme for conversational gestures: How to economically capture timing and form
29. The CHIL audiovisual corpus for lecture and meeting analysis inside smart rooms
30. Comparative evaluation of text classification techniques using a large diverse Arabic dataset
31. Perspectives on crowdsourcing annotations for natural language processing
32. Microblog language identification: Overcoming the limitations of short, unedited and idiomatic text
33. Lightweight methods to estimate influenza rates and alcohol sales volume from Twitter messages
34. The TORGO database of acoustic and articulatory speech from speakers with dysarthria
35. AnCora-CO: Coreferentially annotated corpora for Spanish and Catalan
36. The NXT-format Switchboard Corpus: A rich resource for investigating the syntax, semantics, pragmatics and prosody of dialogue
37. Classification of semantic relations between nominals
38. The NITE XML toolkit: Data model and query language
39. A multimodal annotated corpus of consensus decision making meetings
40. The ACL anthology network corpus
41. A multilingual ontology for infectious disease surveillance: Rationale, design and challenges
42. Introduction to EuroWordNet
43. Virtual agent multimodal mimicry of humans
44. Unsupervised morphological parsing of Bengali
45. Automatic keyphrase extraction from scientific articles
46. Creating a live, public short message service corpus: The NUS SMS corpus
47. Creating a system for lexical substitutions from scratch using crowdsourcing
48. Thesaurus or logical ontology, which one do we need for text mining?
49. The Bible as a parallel corpus: Annotating the "book of 2000 tongues"
50. WordNet then and now
51. MULTEXT-East: Morphosyntactic resources for Central and Eastern European languages
52. Corpus-based generation of head and eyebrow motion for an embodied conversational agent
53. Guidelines for word alignment evaluation and manual alignment
54. On the evaluation and improvement of Arabic WordNet coverage and usability
55. Lessons from building a Persian written corpus: Peykare
56. Japanese/English cross-language information retrieval: Exploration of query translation and transliteration
57. SpatialML: Annotation scheme, resources, and evaluation
58. Compositionality and lexical alignment of multi-word terms
59. TimeBank evolution as a community resource for TimeML parsing
60. The chicken-and-egg problem in wordnet design: Synonymy, synsets and constitutive relations
61. The Corpus DIMEx100: Transcription and evaluation
62. Temporal closure in an annotation environment
63. Hierarchical decision lists for word sense disambiguation
64. Balanced corpus of contemporary written Japanese
65. Dannet: The challenge of compiling a wordnet for Danish by reusing a monolingual dictionary
66. Getting to the heart of the matter: Speech as the expression of affect; Rather than just text or language
67. Do word meanings exist?
68. The top-down strategy for building EuroWordNet: Vocabulary coverage, base concepts and top ontology
69. The state of authorship attribution studies: Some problems and solutions
70. ECO and Onto.PT: A flexible approach for creating a Portuguese wordnet automatically
71. IPLR: An online resource for Greek word-level and sublexical information
72. Improving English verb sense disambiguation performance with linguistically motivated features and clear sense distinction boundaries
73. Evaluation of machine learning-based information extraction algorithms: Criticisms and recommendations
74. A novel approach for ranking spelling error corrections for Urdu
75. And then there were none: Winnowing the Shakespeare claimants
76. GATE Teamware: A web-based, collaborative text annotation framework
77. Glissando: A corpus for multidisciplinary prosodic studies in Spanish and Catalan
78. Alcohol language corpus: The first public corpus of alcoholized German speech
79. Annotating expressions of Appraisal in English
80. A survey of methods to ease the development of highly multilingual text mining applications
81. The Hamburg Metaphor Database project: Issues in resource creation
82. A corpus for studying addressing behaviour in multi-party dialogues
83. Stephen Crane and the New-York Tribune: A case study in traditional and non-traditional authorship attribution
84. SALDO: A touch of yin to WordNet's yang
85. Coupling an annotated corpus and a lexicon for state-of-the-art POS tagging
86. FrameNet, current collaborations and future goals
87. Challenges for a multilingual wordnet
88. Data and models for metonymy resolution
89. Multilingual resources for NLP in the lexical markup framework (LMF)
90. Combining linguistic resources to create a machine-tractable Japanese-Malay dictionary
91. The analysis of embodied communicative feedback in multimodal corpora: A prerequisite for behavior simulation
92. Yule's characteristic K revisited
93. Archaeological data models and web publication using XML
94. HamleDT: Harmonized multi-language dependency treebank
95. Supervised collaboration for syntactic annotation of Quranic Arabic
96. Is singular value decomposition useful for word similarity extraction?
97. Alignment-based extraction of multiword expressions
98. Multilingual collocation extraction with a syntactic parser
99. Copy detection in Chinese documents using Ferret
100. Cross-lingual sense determination: Can it work?
101. Access to pictorial material: A review of current research and future prospects
102. Spontaneous speech and opinion detection: Mining call-centre transcripts
103. Phonetically rich and balanced text and speech corpora for Arabic language
104. Methodology and construction of the Basque WordNet
105. Multiword expressions: Hard going or plain sailing?
106. Product named entity recognition in Chinese text
107. Dimensionality of dialogue act tagsets: An empirical analysis of large corpora
108. Reader-based exploration of lexical cohesion
109. Fact distribution in Information Extraction
110. Gore galore: Literary theory and computer games
111. The linguistic design of the EuroWordNet database
112. An Estonian morphological analyser and the impact of a corpus on its development
113. An overview of the European Union’s highly multilingual parallel corpora
114. Evaluating word sense induction and disambiguation methods
115. WHAD: Wikipedia historical attributes data: Historical structured data extraction and vandalism detection from the Wikipedia edit history
116. Classifying unlabeled short texts using a fuzzy declarative approach
117. A real time Named Entity Recognition system for Arabic text mining
118. Resources for Turkish morphological processing
119. Annotation of multiword expressions in the Prague dependency treebank
120. Valence extraction using EM selection and co-occurrence matrices
121. Normalization of Chinese chat language
122. LTAG-spinal and the Treebank: A new resource for incremental, dependency and semantic parsing
123. The GOLD Community of Practice: An infrastructure for linguistic data on the Web
124. The Linguistic Annotation Framework: a standard for annotation interchange and merging
125. Fine-grained Dutch named entity recognition
126. InSight Interaction: a multimodal and multifocal dialogue corpus
127. Evaluating and automating the annotation of a learner corpus
128. Analyzing the capabilities of crowdsourcing services for text summarization
129. Is there a language of sentiment? An analysis of lexical resources for sentiment analysis
130. EmoTales: Creating a corpus of folk tales with emotional annotations
131. Constructing and utilizing wordnets using statistical methods
132. Question answering at the cross-language evaluation forum 2003-2010
133. Overcoming statistical machine translation limitations: Error analysis and proposed solutions for the Catalan-Spanish language pair
134. Compilation of an idiom example database for supervised idiom identification
135. Multilingual language resources and interoperability
136. Automatic building of an ontology on the basis of text corpora in Thai
137. From the field to the web: Implementing best-practice recommendations in documentary linguistics
138. Tagging Icelandic text: An experiment with integrations and combinations of taggers
139. Adaptation of an automotive dialogue system to users' expertise and evaluation of the system
140. Applying EuroWordNet to cross-language text retrieval
141. Traditional and emotional stylometric analysis of the songs of Beatles Paul McCartney and John Lennon
142. A qualitative comparison method for rhetorical structures: identifying different discourse structures in multilingual corpora
143. Building the essential resources for Finnish: the Turku Dependency Treebank
144. Is it possible to create a very large wordnet in 100 days? An evaluation
145. Large, huge or gigantic? Identifying and encoding intensity relations among adjectives in WordNet
146. The Spanish DELPH-IN grammar
147. Semi-automatic enrichment of crowdsourced synonymy networks: The WISIGOTH system applied to Wiktionary
148. The MATCH corpus: A corpus of older and younger users' interactions with spoken dialogue systems
149. Irony in a judicial debate: Analyzing the subtleties of irony while testing the subtleties of an annotation scheme
150. Automatic induction of language model data for a spoken dialogue system
151. Can we talk? Methods for evaluation and training of spoken dialogue systems
152. Digital facsimiles: Reading the William Blake archive
153. Peeling an onion: The lexicographer's experience of manual sense-tagging
154. Computers and resource-based history teaching: A UK perspective
155. Using the right tools: Enhancing retrieval from marked-up documents
156. CityU corpus of essay drafts of English language learners: a corpus of textual revision in second language writing
157. The good, the bad and the implicit: a comprehensive approach to annotating explicit and implicit sentiment
158. The Chinese Discourse TreeBank: a Chinese corpus annotated with discourse relations
159. Bucking the trend: Improved evaluation and annotation practices for ESL error detection systems
160. The Romanian wordnet in a nutshell
161. Twitter n-gram corpus with demographic metadata
162. Collective intelligence and language resources: Introduction to the special issue on collaboratively constructed language resources
163. Multiplicity and word sense: Evaluating and learning from multiply labeled word sense annotations
164. Annotation of sentence structure: Capturing the relationship between clauses in Czech sentences
165. Collecting and evaluating speech recognition corpora for 11 South African languages
166. Statistical unicodification of African languages
167. DuELME: A Dutch electronic lexicon of multiword expressions
168. WOZ acoustic data collection for interactive TV
169. Exploring interoperability of language resources: The case of cross-lingual semi-automatic enrichment of wordnets
170. Lexical systems: Graph models of natural language lexicons
171. The Hinoki syntactic and semantic treebank of Japanese (Language Resources and Evaluation DOI: 10.1007/s10579-007-9036-6)
172. The importance of gaze and gesture in interactive multimodal explanation
173. Urdu in a parallel grammar development environment
174. Annotating discourse markers in spontaneous speech corpora on an example for the Slovenian language
175. Automatically learning semantic knowledge about multiword predicates
176. The Hinoki syntactic and semantic treebank of Japanese
177. Complex predicates in Indian languages and wordnets
178. Automatically generating related queries in Japanese
179. Detecting Japanese idioms with a linguistically rich dictionary
180. How to measure the meanings of words? amour in Corneille's work
181. The role of inference in the temporal annotation and analysis of text
182. Some of my best friends are linguists
183. Statistical morphological disambiguation for agglutinative languages
184. Pattern processing in melodic sequences: Challenges, caveats and prospects
185. Wag the dog? Online conferencing and teaching
186. Senseval: The CL research experience
187. Discovering Buffalo story robes: A case for cross-domain information strategies
188. Cross-linguistic alignment of WordNets with an inter-lingual-index
189. Developing a successful SemEval task in sentiment analysis of Twitter and other social media texts
190. A massively parallel corpus: the Bible in 100 languages
191. Multimodal corpus of multiparty conversations in L1 and L2 languages and findings obtained from it
192. Automatic dialogue act recognition with syntactic features
193. Text simplification resources for Spanish
194. Capturing divergence in dependency trees to improve syntactic projection
195. TypeCraft collaborative databasing and resource sharing for linguists
196. Introduction to the special issue: On wordnets and relations
197. Tailoring the automated construction of large-scale taxonomies using the web
198. Beyond sentence-level semantic role labeling: Linking argument structures in discourse
199. Coreference resolution: An empirical study based on SemEval-2010 shared Task 1
200. An open diachronic corpus of historical Spanish