clic.db.corpora: Resolve corpora lists
corpora_to_book_ids
The corpora parameter accepts several ways of specifying which books to search. Each is described below.
First, build a very cut-down library of books in which each book's text is just the word “the”, so that a single query matches all of them:
>>> from clic.db.corpus import put_corpus
>>> db_cur = test_database(
... alice='''
... Alice’s Adventures in Wonderland
... Lewis Carroll
...
... the
... '''.strip(),
...
... willows='''
... The Wind in the Willows
... Kenneth Grahame
...
... the
... '''.strip(),
...
... sense='''
... Sense and Sensibility
... Jane Austen
...
... the
... '''.strip(),
...
... northanger='''
... Northanger Abbey
... Jane Austen
...
... the
... '''.strip(),
...
... gulliver='''
... Gulliver's Travels
... Jonathan Swift
...
... the
... '''.strip())
>>> put_corpus(db_cur, dict(name='ChiLit', contents=['alice', 'willows'], title='', ordering=1))
>>> put_corpus(db_cur, dict(name='ArTs', contents=['sense', 'northanger', 'gulliver'], title='', ordering=2))
We use concordance for these examples, but the corpora parameter should be resolved the same way whichever query is used:
>>> from ..concordance import concordance
book name
We can use book names directly:
>>> just_metadata(concordance(db_cur, ['alice', 'willows'], q=["the"], metadata=['book_titles']))
{'book_titles':
{'alice': ['Alice’s Adventures in Wonderland', 'Lewis Carroll'],
'willows': ['The Wind in the Willows', 'Kenneth Grahame']}}
author
Prefixing an author name with author: selects every book by that author:
>>> just_metadata(concordance(db_cur, ['author:Jane Austen'], q=["the"], metadata=['book_titles']))
{'book_titles':
{'northanger': ['Northanger Abbey', 'Jane Austen'],
'sense': ['Sense and Sensibility', 'Jane Austen']}}
corpus name
Prefixing a corpus name with corpus: selects every book in that corpus:
>>> just_metadata(concordance(db_cur, ['corpus:ChiLit'], q=["the"], metadata=['book_titles']))
{'book_titles':
{'alice': ['Alice’s Adventures in Wonderland', 'Lewis Carroll'],
'willows': ['The Wind in the Willows', 'Kenneth Grahame']}}
old names
Some old names, whilst deprecated, should still work:
>>> just_metadata(concordance(db_cur, ['Other'], q=["the"], metadata=['book_titles']))
{'book_titles':
{'gulliver': ["Gulliver's Travels", 'Jonathan Swift'],
'northanger': ['Northanger Abbey', 'Jane Austen'],
'sense': ['Sense and Sensibility', 'Jane Austen']}}
clic.db.corpora.corpora_to_book_ids(cur, corpora)
Resolve a list of corpora into a set of book IDs. Each entry in corpora can be:
- (book_name): Fetch the relevant book ID
- “corpus:DNov”: All books in that corpus
- “author:Thomas Hardy”: All books by that author
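The resolution rules above can be sketched without a database. This is an illustration only, not CLiC's implementation: resolve_corpora, BOOKS, CORPORA and LEGACY_NAMES are hypothetical names, book names stand in for book IDs, and the mapping of the old name 'Other' onto the ArTs corpus is an assumption inferred from the example output above.

```python
# Hypothetical in-memory stand-ins for the database tables (not CLiC's schema).
# Maps book name -> author; book names double as book IDs in this sketch.
BOOKS = {
    'alice': 'Lewis Carroll',
    'willows': 'Kenneth Grahame',
    'sense': 'Jane Austen',
    'northanger': 'Jane Austen',
    'gulliver': 'Jonathan Swift',
}
# Maps corpus name -> books it contains.
CORPORA = {
    'ChiLit': ['alice', 'willows'],
    'ArTs': ['sense', 'northanger', 'gulliver'],
}
# Assumption: a deprecated name is simply rewritten to a current selector.
LEGACY_NAMES = {'Other': 'corpus:ArTs'}


def resolve_corpora(corpora):
    """Resolve a list of selectors into a set of book IDs (illustrative only)."""
    book_ids = set()
    for selector in corpora:
        # Translate any deprecated name before interpreting the selector.
        selector = LEGACY_NAMES.get(selector, selector)
        if selector.startswith('corpus:'):
            # All books in the named corpus.
            book_ids.update(CORPORA[selector[len('corpus:'):]])
        elif selector.startswith('author:'):
            # All books whose author matches exactly.
            author = selector[len('author:'):]
            book_ids.update(b for b, a in BOOKS.items() if a == author)
        else:
            # A bare entry is treated as a book name.
            if selector not in BOOKS:
                raise ValueError("Unknown book: %s" % selector)
            book_ids.add(selector)
    return book_ids


# Selectors of different kinds can be mixed in one list.
print(sorted(resolve_corpora(['corpus:ChiLit', 'author:Jane Austen'])))
# ['alice', 'northanger', 'sense', 'willows']
```

Returning a set means a book named twice, for example once directly and once through its corpus, is only counted once.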