
An Interaction Approach Between Services for Extracting Relevant Data From Tweets Corpora

14 pages. Published: November 28, 2016

Abstract

We present a system motivated by the need for an infrastructure in which software agents can operate on, compose, and make sense of the contents of Web resources, developed as a multi-agent system oriented toward service interactions. Our method follows established ontology construction techniques and updates the ontology by extracting new terms and integrating them into it. It is based on detecting phrases via the ontological database DBPedia. The system processes each phrase (syntagm) extracted from the corpus of messages and checks whether it can be associated directly with a DBPedia knowledge entry. In case of failure, the service agents interact with each other to find the best possible answer to the problem, operating directly on the phrase and trying to modify it semantically until the association with ontological knowledge becomes possible. The advantage of our approach is its modularity: a service can be added, modified, or deleted, or a new one defined, and the final outcome changes accordingly. We were able to compare the results extracted from a heterogeneous body of messages from the Twitter social network with those of the Tagme method, which is based mainly on the storage and annotation of an encyclopaedic corpus.
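The lookup-then-modify loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the in-memory `KNOWLEDGE_BASE` stands in for DBPedia queries, and the three services (`strip_marker`, `split_camel_case`, `lowercase`) as well as the function `link_phrase` and its `max_steps` parameter are hypothetical names chosen for illustration. The sketch only shows how independent services might rewrite a phrase until a knowledge entry matches.

```python
import re
from typing import Callable, List, Optional

# Assumption: a tiny in-memory table stands in for DBPedia lookups.
# The entries are illustrative only.
KNOWLEDGE_BASE = {
    "eiffel tower": "http://dbpedia.org/resource/Eiffel_Tower",
    "paris": "http://dbpedia.org/resource/Paris",
}

def lookup(phrase: str) -> Optional[str]:
    """Return the resource associated with the phrase, or None."""
    return KNOWLEDGE_BASE.get(phrase)

# Hypothetical services: each one proposes a modified version of a phrase.
def strip_marker(phrase: str) -> str:
    return phrase.lstrip("#@")

def split_camel_case(phrase: str) -> str:
    return re.sub(r"(?<=[a-z])(?=[A-Z])", " ", phrase)

def lowercase(phrase: str) -> str:
    return phrase.lower()

Service = Callable[[str], str]
SERVICES: List[Service] = [strip_marker, split_camel_case, lowercase]

def link_phrase(phrase: str,
                services: Optional[List[Service]] = None,
                max_steps: int = 3) -> Optional[str]:
    """Try a direct association first; on failure, let the services
    rewrite the phrase, breadth-first, for at most max_steps rounds."""
    if services is None:
        services = SERVICES
    frontier = [phrase]
    seen = {phrase}
    for _ in range(max_steps + 1):
        # Check every current candidate against the knowledge base.
        for candidate in frontier:
            resource = lookup(candidate)
            if resource is not None:
                return resource
        # No match: ask every service for a new variant of each candidate.
        next_frontier = []
        for candidate in frontier:
            for service in services:
                modified = service(candidate)
                if modified not in seen:
                    seen.add(modified)
                    next_frontier.append(modified)
        if not next_frontier:
            return None
        frontier = next_frontier
    return None
```

Passing a different `services` list changes which phrases can be linked, which mirrors the modularity argument in the abstract: with only `lowercase` available, the hashtag `#EiffelTower` is never resolved in this sketch, whereas the full list resolves it.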

Keyphrases: big data, context oriented approach, multi agent based simulation, nlp, social network, twitter, web service

In: Antonio Moreno Ortiz and Chantal Pérez-Hernández (editors). CILC2016. 8th International Conference on Corpus Linguistics, vol 1, pages 97-110.

BibTeX entry
@inproceedings{CILC2016:Interaction_Approach_Between_Services,
  author    = {Mehdy Dref and Anna Pappa},
  title     = {An Interaction Approach Between Services for Extracting Relevant Data From Tweets Corpora},
  booktitle = {CILC2016. 8th International Conference on Corpus Linguistics},
  editor    = {Antonio Moreno Ortiz and Chantal Pérez-Hernández},
  series    = {EPiC Series in Language and Linguistics},
  volume    = {1},
  publisher = {EasyChair},
  bibsource = {EasyChair, https://easychair.org},
  issn      = {2398-5283},
  url       = {/publications/paper/QHxp},
  doi       = {10.29007/4tbj},
  pages     = {97-110},
  year      = {2016}}