Graphia is a framework that extracts structured data graphs from factual unstructured text. Rather than extracting simple relations or committing to a specific conceptual model, Graphia aims to extract graphs that can represent the complexity of the contexts present in the text.
The graph representation adopted by the framework, Structured Discourse Graphs (SDGs), can be naturally serialized as an entity-centric RDF graph, which makes it easier to integrate and use the graph with other resources and applications. The representation also supports a pay-as-you-go (semantic best-effort) extraction, in which comprehensive coverage is prioritized over accuracy and the quality of the extracted graph improves over time.
Examples of extracted graphs can be found here.
Features included in the framework:
- Structured Discourse Graph extraction and visualization
- Named entity resolution to DBpedia entities
- Co-reference resolution
- Serialization as RDF
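
To make the idea of an entity-centric RDF serialization more concrete, here is a minimal Python sketch. It is not Graphia's API or its actual output: the `http://example.org/sdg/` namespace, the predicate names, the helper functions, and the sample triples are all assumptions chosen for illustration. It only shows how statements resolved to DBpedia entities might be grouped by subject and written out as N-Triples.

```python
from collections import defaultdict

DBPEDIA = "http://dbpedia.org/resource/"
EX = "http://example.org/sdg/"  # hypothetical namespace, not Graphia's vocabulary


def to_ntriples(triples):
    """Serialize (subject, predicate, object) string tuples as N-Triples.

    Objects that start with http:// or https:// are written as IRIs;
    anything else is written as a plain string literal.
    """
    lines = []
    for s, p, o in triples:
        if o.startswith(("http://", "https://")):
            obj = f"<{o}>"
        else:
            # Escape backslashes, quotes and newlines inside the literal.
            escaped = (
                o.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
            )
            obj = f'"{escaped}"'
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


def group_by_entity(triples):
    """Group statements by subject, giving an entity-centric view."""
    grouped = defaultdict(list)
    for s, p, o in triples:
        grouped[s].append((p, o))
    return dict(grouped)


# Illustrative statements only; a real extraction would produce these from text.
triples = [
    (DBPEDIA + "Marie_Curie", EX + "bornIn", DBPEDIA + "Warsaw"),
    (DBPEDIA + "Marie_Curie", EX + "wonAward", DBPEDIA + "Nobel_Prize_in_Physics"),
    (DBPEDIA + "Marie_Curie", EX + "label", "Marie Curie"),
]

if __name__ == "__main__":
    print(to_ntriples(triples))
```

Because every statement hangs off an entity IRI, output of this shape can be loaded into any RDF store and joined with other datasets that use the same DBpedia identifiers.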