
Knowledge Resources for Common Sense

Editors: Filip Ilievski, Yang Qiao, Bill Yuchen Lin

Many resources for commonsense reasoning have been developed, spanning various acquisition methods, representations, and intended applications. Here we group key resources based on the type of knowledge they capture.



Tags:

  • s|symbolic: symbolic knowledge resources, usually in graph structures.
    • s-t|symbolic-triple: RDF-style triples, e.g., (apple, HasProperty, round)
  • n|neural: neural knowledge resources, typically a neural language model.
  • c|corpus: an unstructured knowledge corpus consisting of commonsense facts.
  • o|object: object-centric knowledge.
  • e|event: event-centric knowledge.
  • so|social: social commonsense knowledge.
  • ph|physical: physical commonsense knowledge.
  • a|auto-extracted: automatically extracted from the web.
  • h|human-annotated: annotated by humans.
  • ex|explanation: explanation-based knowledge resource.
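As a rough illustration of the symbolic-triple representation named in the tag list, the sketch below parses the example string `(apple, HasProperty, round)` into a small record. The `Triple` class and `parse_triple` helper are hypothetical names introduced here for illustration; they are not part of any listed resource.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """An RDF-style (subject, relation, object) assertion."""
    subject: str
    relation: str
    obj: str


def parse_triple(text: str) -> Triple:
    """Parse a string such as "(apple, HasProperty, round)" into a Triple."""
    # Drop surrounding whitespace and parentheses, then split on commas.
    parts = [p.strip() for p in text.strip().strip("()").split(",")]
    if len(parts) != 3:
        raise ValueError(f"expected 3 fields, got {len(parts)}: {text!r}")
    return Triple(*parts)


triple = parse_triple("(apple, HasProperty, round)")
print(triple.subject, triple.relation, triple.obj)
```

This simple comma split assumes that none of the three fields contains a comma; real triple stores usually need a proper serialization format.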


Commonsense Knowledge

These sources have been deliberately created to capture commonsense knowledge, in either a broad or a narrow domain.

📜 [ConceptNet] ConceptNet 5.5: An open multilingual graph of general knowledge.
✍ Speer, R., Chin, J., & Havasi, C. (AAAI 2017)

Paper Data Explore

  • Tags:
  • Size: Around 1.6 million edges connecting more than 300,000 nodes.
  • Creation:
Illustrative Example


📜 [ATOMIC] ATOMIC: An atlas of machine commonsense for if-then reasoning.
✍ Sap, M., Le Bras, R., Allaway, E., Bhagavatula, C., Lourie, N., Rashkin, H., Roof, B., Smith, N.A. and Choi, Y. (AAAI 2019)

Paper Data Explore

  • Tags:
  • Size: Around 877k textual descriptions of inferential knowledge.
  • Creation:
Illustrative Example
   Event: "PersonX uses PersonX's ___ to obtain"
      oEffect: []
      oReact: ['annoyed', 'angry', 'worried']
      oWant: []
      prefix: ['uses', 'obtain']
      split: 'trn'
      xAttr: []
      xEffect: []
      xIntent: ['to have an advantage', 'to fulfill a desire', 'to get out of trouble']
      xNeed: []
      xReact: ['pleased', 'smug', 'excited']
      xWant: []
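The record above can be read as a set of (event, relation, inference) triples, one per non-empty list entry. The following is a minimal sketch of that flattening, assuming the record is held as a Python dict; the key name `event`, the `RELATIONS` tuple, and the `to_triples` helper are assumptions made for this illustration, not the format of the released files.

```python
# The nine inference relations that appear in the example record.
RELATIONS = (
    "oEffect", "oReact", "oWant",
    "xAttr", "xEffect", "xIntent", "xNeed", "xReact", "xWant",
)

# Values copied from the illustrative example; the dict layout is assumed.
record = {
    "event": "PersonX uses PersonX's ___ to obtain",
    "oEffect": [],
    "oReact": ["annoyed", "angry", "worried"],
    "oWant": [],
    "prefix": ["uses", "obtain"],
    "split": "trn",
    "xAttr": [],
    "xEffect": [],
    "xIntent": ["to have an advantage", "to fulfill a desire",
                "to get out of trouble"],
    "xNeed": [],
    "xReact": ["pleased", "smug", "excited"],
    "xWant": [],
}


def to_triples(rec):
    """Flatten one record into (event, relation, inference) triples."""
    event = rec["event"]
    return [
        (event, relation, inference)
        for relation in RELATIONS
        for inference in rec.get(relation, [])
    ]


triples = to_triples(record)
for t in triples:
    print(t)
```

Metadata fields such as `prefix` and `split` are skipped because they do not describe inferences.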


📜 [ATOMIC2020] COMET-ATOMIC 2020: On symbolic and neural commonsense knowledge graphs.
✍ Hwang, J. D., Bhagavatula, C., Bras, R. L., Da, J., Sakaguchi, K., Bosselut, A., & Choi, Y. (arXiv 2020)

Paper Data

  • Tags:
  • Size: Around 1.33M everyday inferential knowledge tuples about entities and events.
  • Creation:
Illustrative Example


📜 [GLUCOSE] GLUCOSE: Generalized and contextualized story explanations.
✍ Mostafazadeh, N., Kalyanpur, A., Moon, L., Buchanan, D., Berkowitz, L., Biran, O., & Chu-Carroll, J. (EMNLP 2020)

Paper Data

  • Tags:
  • Size: More than 670K GLUCOSE annotations (335K pairs).
  • Creation:
Illustrative Example

Entries in the GLUCOSE dataset that explain the Gage story around the sentence X= Gage turned his bike sharply.


📜 [WebChild] WebChild 2.0: Fine-grained commonsense knowledge distillation.
✍ Tandon, N., De Melo, G., & Weikum, G. (ACL 2017)

Paper Data Explore

  • Tags:
  • Size: Over 2 million disambiguated concepts and activities, connected by over 18 million assertions.
  • Creation:
Illustrative Example
   #word: animal
   sense-number: 1
   WordNet-synsetid: 100015388
   Definition (WordNet gloss): a living organism characterized by voluntary movement

WebChild 2.0 browser results for animal.


📜 [QuasimodoKB] Commonsense properties from query logs and question answering forums.
✍ Romero, J., Razniewski, S., Pal, K., Z. Pan, J., Sakhadeo, A., & Weikum, G. (CIKM 2019)

Paper Data Explore

  • Tags:
  • Size: Includes 80,145 subjects, 78,636 predicates, and 2,262,109 triples.
  • Creation:
Illustrative Example

Quasimodo browser results for eggplant.


📜 [ASCENT] Advanced Semantics for Commonsense Knowledge Extraction.
✍ Nguyen, T. P., Razniewski, S., & Weikum, G. (arXiv 2020)

Paper Data Explore

  • Tags:
  • Size: Contains more than 284,000 subgroups and 92,000 related aspects, with 8.6 million assertions and 4.4 million facets in total.
  • Creation:
Illustrative Example

Example of Ascent’s knowledge for the concept elephant.


📜 [SenticNet] SenticNet 5: Discovering conceptual primitives for sentiment analysis by means of context embeddings.
✍ Cambria, E., Poria, S., Hazarika, D., & Kwok, K. (AAAI 2018)

Paper Data Explore

  • Tags:
  • Size: Contains around 100,000 commonsense concepts.
  • Creation:
Illustrative Example
   concept_name: intact
   introspection_value: 0.716,  temper_value: -0.62,  attitude_value: 0,  sensitivity_value: 0.896
   primary_mood: joy,  secondary_mood: eagerness,  polarity_label: positive,  polarity_value: 0.328
   semantics1: constitutional,  semantics2: intrinsic,  semantics3: intimate,  semantics4: inner,  semantics5: inbuilt

A sketch of SenticNet 5’s graph showing part of the semantic network for the primitive INTACT.


📜 [HasPartKB] Do dogs have whiskers? A new knowledge base of hasPart relations.
✍ Bhakthavatsalam, S., Richardson, K., Tandon, N., & Clark, P. (arXiv 2020)

Paper Data

  • Tags:
  • Size: Contains 50,752 commonsense concepts in total, of which a subset of 15,200 concepts is within a fifth-grade vocabulary.
  • Creation:
Illustrative Example
   arg1: snowdrop,  metadata: wikipedia_primary_page -- Galanthus
   arg2: carpel,  metadata: synset -- wn.carpel.n.01
   average_score: 0.9990746974945068
   matches: some carpels are part of snowdrops.
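Each entry carries an `average_score`, so one natural use is to keep only high-scoring part-whole pairs. The sketch below shows that idea on the entry above, assuming entries are held as Python dicts; the dict layout, the `keep_confident` and `as_statement` helpers, and the 0.9 threshold are all assumptions made for illustration.

```python
# Values copied from the example entry; the dict layout is assumed.
entry = {
    "arg1": "snowdrop",
    "arg2": "carpel",
    "average_score": 0.9990746974945068,
}


def keep_confident(entries, threshold=0.9):
    """Return the entries whose average_score is at least `threshold`."""
    return [e for e in entries if e["average_score"] >= threshold]


def as_statement(e):
    """Render an entry as '<whole> hasPart <part>'."""
    return f"{e['arg1']} hasPart {e['arg2']}"


confident = keep_confident([entry])
statements = [as_statement(e) for e in confident]
print(statements)
```

Here `arg1` is read as the whole and `arg2` as the part, which matches the sentence "some carpels are part of snowdrops" in the entry.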

📜 [ASER] ASER: A Large-scale Eventuality Knowledge Graph.
✍ Hongming Zhang, Xin Liu, Haojie Pan, Yangqiu Song, Cane Wing-Ki Leung. (WWW 2020)

Paper Data Github

  • Tags:
  • Size: In total, ASER contains 194 million unique eventualities. After bootstrapping, ASER contains 64 million edges among eventualities.
  • Creation:
Illustrative Example



📜 [CYC] CYC: A large-scale investment in knowledge infrastructure.
✍ Lenat, D. (Communications of the ACM 1995)

Paper Data

Note that the data link points to OpenCyc, which is a subset of Cyc. The full Cyc is not publicly available.

  • Tags:
  • Size: Unavailable, because the full Cyc has not been released publicly.
  • Creation:
Illustrative Example

Sample assertions of everyday life and objects spanned by the domain of CYC:

   • You have to be awake to eat.
   • You can usually see people’s noses, but not their hearts.
   • Given two professions, either one is a specialization of the other or else they are likely to be independent of one another.
   • You cannot remember events that have not happened yet.
   • If you cut a lump of peanut butter in half, each half is also a lump of peanut butter; but if you cut a table in half, neither half is a table.

📜 [COMET] COMET: Commonsense transformers for automatic knowledge graph construction.
✍ Bosselut, A., Rashkin, H., Sap, M., Malaviya, C., Celikyilmaz, A., & Choi, Y. (ACL 2019)

Paper Explore Github

  • Tags:
  • Size: Not applicable. COMET is a neural model for automatic knowledge base construction, trained on ATOMIC and ConceptNet, rather than a fixed-size knowledge base.
  • Creation:
Illustrative Example

An example image of PersonX acts quickly from the COMET dataset.


📜 [GenericsKB] GenericsKB: A knowledge base of generic statements.
✍ Bhakthavatsalam, S., Anastasiades, C., & Clark, P. (arXiv 2020)

Paper Data

  • Tags:
  • Size: Contains more than 3.5M generic sentences.
  • Creation:
Illustrative Example

Example generics about tree in GENERICSKB:

   1. Trees are perennial plants that have long woody trunks.
   2. Trees are woody plants which continue growing until they die.
   3. Most trees add one new ring for each year of growth.
   4. Trees produce oxygen by absorbing carbon dioxide from the air.
   5. Trees are large, generally single-stemmed, woody plants.
   6. Trees live in cavities or hollows.
   7. Trees grow using photosynthesis, absorbing carbon dioxide and releasing oxygen.

An example entry, including metadata

   Term: tree
   Sent: Most trees add one new ring for each year of growth.
   Quantifier: Most
   Score: 0.35
   Before: ...Notice how the extractor holds the core as it is removed from inside the hollow center of the bit. Tree cores are extracted with an increment borer.
   After: The width of each annual ring may be a reflection of forest stand dynamics. Dendrochronology, the study of annual growth rings, has become prominent in ecology...
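The `Quantifier` field in the entry above corresponds to the first word of the sentence ("Most"). As a rough sketch of how such a field could be derived, the snippet below checks whether a sentence starts with a quantifier word. The `QUANTIFIERS` list and the `leading_quantifier` helper are hypothetical and only illustrative; this is not a description of how GenericsKB itself was built.

```python
from typing import Optional

# Hypothetical, non-exhaustive quantifier list chosen for this sketch.
QUANTIFIERS = ("all", "most", "many", "some", "few", "no")


def leading_quantifier(sentence: str) -> Optional[str]:
    """Return the first word if it is a known quantifier, else None."""
    words = sentence.split()
    if words and words[0].lower() in QUANTIFIERS:
        return words[0]
    return None


quantified = leading_quantifier(
    "Most trees add one new ring for each year of growth."
)
bare = leading_quantifier(
    "Trees are large, generally single-stemmed, woody plants."
)
print(quantified, bare)
```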

📜 [InScript] InScript: Narrative texts annotated with script information.
✍ Ashutosh Modi, Tatjana Anikina, Simon Ostermann, Manfred Pinkal. (LREC 2016)

Paper Data

  • Tags:
  • Size: A corpus of 1,000 stories centered around 10 different scenarios, with about 200,000 words in total.
  • Creation:
Illustrative Example

An excerpt from a story on TAKING A BATH script.

   I was sitting on my couch when I decided that I hadn’t taken a bath in a while so I stood up and walked to the bathroom where I turned on the faucet in the sink and began filling the bath with hot water. 
   While the tub was filling with hot water I put some bubble bath into the stream of hot water coming out of the faucet so that the tub filled with not only hot water[...]

📜 [Social Chemistry 101] Social Chemistry 101: Learning to Reason about Social and Moral Norms.
✍ Maxwell Forbes, Jena D. Hwang, Vered Shwartz, Maarten Sap, Yejin Choi. (EMNLP 2020)

Paper Data

  • Tags:
  • Size:
  • Creation:
Illustrative Example
SITUATION
Narrator: "Asking my boyfriend to stop being friends with his ex"
RULE-OF-THUMB 1 for Narrator
It's okay to ask your significant other to stop doing something you're uncomfortable with
RULE-OF-THUMB 2 for Narrator
It's not right to tell another person who to spend time with

Common Knowledge

These sources contain general-domain knowledge, known to many people, but not limited to commonsense knowledge. They include, for instance, knowledge about entities and events.

📜 [Wikidata] Wikidata: a free collaborative knowledgebase.
✍ Vrandečić, D., & Krötzsch, M. (Communications of the ACM 2014)

Paper Data Explore

  • Tags:
  • Size: Approximately 14.5 million items and 36 million language links.
  • Creation:
Illustrative Example

Wikidata browser results for animal.


📜 [Wikidata-CS] Commonsense knowledge in Wikidata.
✍ Ilievski, F., Szekely, P., & Schwabe, D. (ISWC Wikidata workshop 2020)

Paper Data

  • Tags:
  • Size: Contains 71,243 items and 106,103 language links.
  • Creation:
Illustrative Example
   node1: Q1203797,  label: bicycle
   relation: /r/IsA
   node2: Q2207288,  label: messenger
   label relation: instance of

📜 [YAGO] YAGO 4: A reason-able knowledge base.
✍ Tanon, T. P., Weikum, G., & Suchanek, F. (ESWC 2020)

Paper Data Explore

  • Tags:
  • Size: Contains 10,124 classes, 303M labels, 1,399M descriptions, 68M aliases, and 343M facts.
  • Creation:
Illustrative Example

YAGO browser results for animal.


📜 [SUMO] Towards a standard upper ontology.
✍ Niles, I., & Pease, A. (ICFOIS 2001)

Paper Data Explore

  • Tags:
  • Size: Approximately 25,000 terms and 80,000 axioms when all domain ontologies are combined.
  • Creation:
Illustrative Example

SUMO browser results for animal.


📜 [DOLCE] Sweetening ontologies with DOLCE.
✍ Gangemi, A., Guarino, N., Masolo, C., Oltramari, A., & Schneider, L. (ICKEKM 2002)

Paper Data

  • Tags:
  • Size: The size is unavailable.
  • Creation:
Illustrative Example

Taxonomy of DOLCE basic categories:

Examples of leaf basic categories:


📜 [NELL] Never-ending learning.
✍ T. Mitchell, W. Cohen, E. Hruschka, P. Talukdar, B. Yang, J. Betteridge, A. Carlson, B. Dalvi, M. Gardner, B. Kisiel, J. Krishnamurthy, N. Lao, K. Mazaitis, T. Mohamed, N. Nakashole, E. Platanios, A. Ritter, M. Samadi, B. Settles, R. Wang, D. Wijaya, A. Gupta, X. Chen, A. Saparov, M. Greaves, and J. Welling. (AAAI 2015)

Paper Explore

  • Tags:
  • NELL’s Learning Result: A KB with ~120 million confidence-weighted beliefs. NELL has also learned to improve its reading ability, its reasoning ability, and its learning ability, and has extended its ontology of known relations.
  • Creation:
Illustrative Example

NELL Learned Contexts for “Hotel” (~1% of total)

   "_ is the only five-star hotel" "_ is the only hotel" "_ is the perfect accommodation" "_ is the perfect address" "_ is the perfect lodging" "_ is the sister hotel" "_ is the ultimate hotel" "_ is the value choice" "_ is uniquely situated in" "_ is Walking Distance" "_ is wonderfully situated in" "_ las vegas hotel" "_ los angeles hotels" "_ Make an online hotel reservation" "_ makes a great home-base" "_ mentions Downtown" "_ mette a disposizione" "_ miami south beach" "_ minded traveler" "_ mucha prague Map Hotel" "_ n'est qu'quelques minutes" "_ naturally has a pool" "_ is the perfect central location" "_ is the perfect extended stay hotel" "_ is the perfect headquarters" "_ is the perfect home base" "_ is the perfect lodging choice" "_ north reddington beach" "_ now offer guests" "_ now offers guests" "_ occupies a privileged location" "_ occupies an ideal location" "_ offer a king bed" "_ offer a large bedroom" "_ offer a master bedroom" "_ offer a refrigerator" "_ offer a separate living area" "_ offer a separate living room" "_ offer comfortable rooms" "_ offer complimentary shuttle service" "_ offer deluxe accommodations" "_ offer family rooms" "_ offer secure online reservations" "_ offer upscale amenities" "_ offering a complimentary continental breakfast" "_ offering

Lexical Knowledge

These sources contain knowledge about words, their meaning, and their relations to other words.

📜 [WordNet] WordNet: a lexical database for English.
✍ Miller, G. (Communications of the ACM 1995)

Paper Data Explore

  • Tags:
  • Size: 16MB (including 155,327 words organized in 175,979 synsets for a total of 207,016 word-sense pairs).
  • Creation:
Illustrative Example

WordNet browser results for bicycle.


📜 [FrameNet] The Berkeley FrameNet project.
✍ Baker, C. F., Fillmore, C. J., & Lowe, J. B. (ACL 1998)

Paper Data Explore

  • Tags:
  • Size: The size is unavailable.
  • Creation:
Illustrative Example

FrameNet browser results for abandonment.


📜 [MetaNet] MetaNet: Deep semantic automatic metaphor analysis.
✍ Dodge, E. K., Hong, J., & Stickles, E. (Metaphor in NLP workshop 2015)

Paper Explore

  • Tags:
  • Size: The size is unavailable.
  • Creation:
Illustrative Example

MetaNet browser results for EMOTIONS AND OBJECTS.


📜 [VerbNet] VerbNet: A broad-coverage, comprehensive verb lexicon.
✍ Schuler, K. K. (Dissertation 2005)

Paper Data Explore

  • Tags:
  • Size: Approximately 5,800 English verbs, grouped according to shared syntactic behaviors, thereby revealing generalizations of verb behavior.
  • Creation:
Illustrative Example

VerbNet browser results for see.


Visual Knowledge

Visual knowledge sources make explicit the knowledge that is captured in images.

📜 [Visual Genome] Visual Genome: Connecting language and vision using crowdsourced dense image annotations.
✍ Krishna, R., Zhu, Y., Groth, O., Johnson, J., Hata, K., Kravitz, J., Chen, S., Kalantidis, Y., Li, L.J., Shamma, D.A., Bernstein, M.S. (IJCV 2017)

Paper Data Explore

  • Tags:
  • Size: 108,077 Images, 5.4 Million Region Descriptions, 1.7 Million Visual Question Answers, 3.8 Million Object Instances, 2.8 Million Attributes, and 2.3 Million Relationships.
  • Creation:
Illustrative Example

An example image of throwing frisbee from the Visual Genome dataset.


📜 [Flickr30k] Flickr30k Entities: Collecting region-to-phrase correspondences for richer image-to-sentence models.
✍ Plummer, B. A., Wang, L., Cervantes, C. M., Caicedo, J. C., Hockenmaier, J., & Lazebnik, S. (ICCV 2015)

Paper Data

  • Tags:
  • Size: Contains 244k coreference chains and 276k manually annotated bounding boxes for the 31,783 images and 158,915 English captions (five per image) in the original dataset.
  • Creation:
Illustrative Example

Example images from the Flickr30k Entities dataset.


Consolidation & Surveys

Consolidation efforts integrate existing sources into a single resource. Surveys organize existing knowledge sources within a single theoretical framework.

📜 [CSKG] CSKG: The CommonSense Knowledge Graph.
✍ Ilievski, F., Szekely, P., Zhang, B. (ESWC 2021)

Paper Data

  • Tags:
  • Size: The size is unavailable.
  • Creation:
Illustrative Example
   node1: person
   node2: architect
   label relation: /r/IsA
   sentence: architect is a person

An example graph from the CSKG dataset:


📜 Dimensions of commonsense knowledge.
✍ Ilievski, F., Oltramari, A., Ma, K., Zhang, B., McGuinness, D. L., Szekely, P. (arXiv 2021)

Paper Data

  • Tags:
  • Creation:
Illustrative Example

Examples for food for each of the 13 dimensions:


📜 [NextKB] Analogy and relational representations in the companion cognitive architecture.
✍ Forbus, K. D., & Hinrich, T. (AI Magazine 2017)

Paper Data

  • Tags:
  • Size: The size is unavailable.
  • Creation:

Cited as (TBD)

@electronic{commonsenserun,
  title   = "An Online Compendium for Commonsense Reasoning Research.",
  author  = "Lin, Bill Yuchen and Qiao, Yang and Ilievski, Filip and Zhou, Pei and Wang, Peifeng and Ren, Xiang", 
  journal = "commonsense.run",
  year    = "2021",
  url     = "https://commonsense.run"
}

Page last modified: Apr 24 2021.
