@tonyfast
Last active February 25, 2020 04:14
pidgy programming: a literate programming approach that speaks Markdown and Python. https://mybinder.org/v2/gist/tonyfast/2947b4bb582e193f5b2a7dbf8b009b62/master?filepath=index.ipynb
import pidgy
with pidgy.pidgyLoader():
    try:
        from . import intro
    except:
        import intro
with pidgy.pidgyLoader(main=__name__ == "__main__"):
    try:
        from . import readme
    except:
        import readme

pidgy programming

Abstract

[Literate Programming] is a literary style that treats documents as having dual qualities of literature and computer programs. The original 1979 implementation defined the [WEB] metalanguage of [TeX] and [Pascal]. pidgy is a modern and interactive take on [Literate Programming] that uses Markdown and Python as the respective document and programming languages; of course, we'll add some other bits and bobs.

The result of the pidgy implementation is an interactive programming experience where authors design and program simultaneously in Markdown. An effective literate program uses machine logic to supplement human logic when explaining a program. If the document is a valid module (i.e., it can restart and run all), the literate programs can be imported as Python modules then used as terminal applications, web applications, formal test objects, or APIs. All the while, the program itself is a readable work of literature as HTML or PDF.

pidgy is written as a literate program using Markdown and Python. Throughout this document we'll discuss the applications and methods behind pidgy and what it takes to implement a [Literate Programming] interface in IPython.
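To make the idea of an importable Markdown document concrete, here is a minimal sketch. It is not pidgy's implementation: the `tangle` and `import_markdown` helpers are hypothetical names, and the sketch assumes that only lines indented by four spaces are code, turning prose lines into comments rather than strings. It keeps one output line per input line so that line numbers in tracebacks still point at the Markdown source.

```python
import types


def tangle(markdown: str) -> str:
    """Hypothetical tangle step: keep indented code, comment out prose."""
    lines = []
    for line in markdown.splitlines():
        if line.startswith("    "):
            lines.append(line[4:])      # indented block: dedent and keep as Python
        elif line.strip():
            lines.append("# " + line)   # prose: preserved as a comment
        else:
            lines.append("")            # blank lines are kept to preserve numbering
    return "\n".join(lines) + "\n"


def import_markdown(name: str, markdown: str) -> types.ModuleType:
    """Hypothetical loader: execute the tangled source in a fresh module."""
    module = types.ModuleType(name)
    exec(compile(tangle(markdown), name + ".md", "exec"), module.__dict__)
    return module


DOCUMENT = """\
A tiny essay
============

The answer is computed in the narrative.

    answer = 6 * 7

And a function is defined for reuse.

    def double(x):
        return 2 * x
"""

essay = import_markdown("essay", DOCUMENT)
print(essay.answer, essay.double(essay.answer))
```

The document only imports cleanly if every code block runs from top to bottom, which is the "restart and run all" condition described above.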

Topics

  • Literate Programming
  • Computational Notebooks
  • Markdown
  • Python
  • Jupyter
  • IPython

Author

Tony Fast

Best practices for literate programming

The first obligation of the literate programmer, defined by Donald Knuth (i.e., the prophet of Literate Programming), is a core moral commitment to write literate programs, because:

...; surely nobody wants to admit writing an illiterate program.

The following best practices for literate programming have emerged while designing pidgy.

List of best practices

  • Restart and run all or it didn't happen.

    A document should be literate in all readable, reproducible, and reusable contexts.

  • When in doubt, abide by the Web Content Accessibility Guidelines so that information can be accessed by differently abled audiences.

  • Markdown documents are sufficient for single units of thought.

    Markdown documents that translate to python can encode literate programs in a form that works better with version control systems than the json format that encodes notebooks.

  • All code should compute.

    Testing code in a narrative provides supplemental meaning to the "code" signifiers. The tests provide a check of veracity, at least for the computational literacy.

  • readme.md is a good default name for a program.

    Eventually authors will compose ["readme.md"] documents that act as both the "__init__" and "__main__" methods of the program.

  • Each document should stand alone, despite all possibilities to fall.

  • Use code, data, and visualization to fill the voids of natural language.

  • Find pleasure in writing.

  • When writing narrative, include one unit of meaning per line.

    A sentence represents the maximum unit that can be broken up into smaller diffable units. This approach will create cleaner histories in revision control systems.
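As an illustration of "All code should compute", the sketch below checks the code examples embedded in a piece of narrative using Python's standard `doctest` module. The `check_narrative` helper and the sample text are assumptions made for this example; they are not part of pidgy, which has its own interactive testing step.

```python
import doctest

NARRATIVE = """
Doubling a list repeats its items, which is easy to claim and easy to check.

    >>> [1, 2] * 2
    [1, 2, 1, 2]

Sorting is stable with respect to equal keys.

    >>> sorted(["bb", "a", "cc"], key=len)
    ['a', 'bb', 'cc']
"""


def check_narrative(text: str, name: str = "narrative") -> doctest.TestResults:
    """Hypothetical helper: run every doctest example found in the prose."""
    parser = doctest.DocTestParser()
    # Collect the interactive examples from the text into a single test.
    test = parser.get_doctest(text, globs={}, name=name, filename=None, lineno=0)
    runner = doctest.DocTestRunner(verbose=False)
    runner.run(test)
    # summarize() returns counts of failed and attempted examples.
    return runner.summarize(verbose=False)


results = check_narrative(NARRATIVE)
print(results)
```

A narrative whose examples all pass carries at least a computational check of its claims; a failing example points at a sentence that no longer matches the code.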

{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"toc-hr-collapsed": false
},
"source": [
"# `pidgy` programming\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"<!--\n",
" \n",
" import pidgy, pathlib, __init__ as paper, nbconvert, best_practices\n",
"\n",
" load = lambda x, level=1: demote(pathlib.Path(x.__file__).read_text(), level)\n",
" demote = lambda x, i: ''.join(\n",
" '#'*i + x if x.startswith('#') else x for x in x.splitlines(True)\n",
" )\n",
"\n",
" def load(x, level=1):\n",
" return demote(\n",
" pathlib.Path(x.__file__).read_text()\n",
" if x.__file__.endswith('.md')\n",
" else nbconvert.get_exporter('markdown')(exclude_input=True).from_filename(x.__file__)[0], level) \n",
" \n",
" \n",
"\n",
" \n",
" with pidgy.pidgyLoader():\n",
" import pidgy.pytest_config.readme, pidgy.tests.test_pidgin_syntax\n",
"\n",
"-->"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
" \n",
" import pidgy, pathlib, __init__ as paper, nbconvert, best_practices\n",
"\n",
" load = lambda x, level=1: demote(pathlib.Path(x.__file__).read_text(), level)\n",
" demote = lambda x, i: ''.join(\n",
" '#'*i + x if x.startswith('#') else x for x in x.splitlines(True)\n",
" )\n",
"\n",
" def load(x, level=1):\n",
" return demote(\n",
" pathlib.Path(x.__file__).read_text()\n",
" if x.__file__.endswith('.md')\n",
" else nbconvert.get_exporter('markdown')(exclude_input=True).from_filename(x.__file__)[0], level) \n",
" \n",
" \n",
"\n",
" \n",
" with pidgy.pidgyLoader():\n",
" import pidgy.pytest_config.readme, pidgy.tests.test_pidgin_syntax"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"toc-hr-collapsed": false
},
"outputs": [
{
"data": {
"text/markdown": [
"## Abstract\n",
"\n",
"[Literate Programming] is a literary style that treats documents\n",
"as having dual qualities of literature and computer programs.\n",
"The original 1979 implementation defined the [WEB] metalanguage\n",
"of [Latex] and [Pascal]. `pidgy` is modern and interactive\n",
"take on [Literate Programming] that uses [Markdown] and [Python] \n",
"as the respective document and programming languages,\n",
"of course we'll add some other bits and bobs.\n",
"\n",
"The result of the `pidgy` implementation is an interactive programming\n",
"experience where authors design and program simultaneously in [Markdown].\n",
"An effective literate programming will use machine logic to supplement\n",
"human logic to explain a program program.\n",
"If the document is a valid module (ie. it can restart and run all),\n",
"the literate programs can be imported as [Python] modules\n",
"then used as terminal applications, web applications, \n",
"formal testing object, or APIs. All the while, the program \n",
"itself is a readable work of literature as html, pdf.\n",
"\n",
"`pidgy` is written as a literate program using [Markdown]\n",
"and [Python].\n",
"Throughout this document we'll discuss\n",
"the applications and methods behind the `pidgy`\n",
"and what it takes to implement a [Literate Programming]\n",
"interface in `IPython`.\n",
"\n",
"## Topics\n",
"\n",
"- Literate Programming\n",
"- Computational Notebooks\n",
"- Markdown\n",
"- Python\n",
"- Jupyter\n",
"- IPython\n",
"\n",
"## Author\n",
"\n",
"[Tony Fast]\n",
"\n",
"<!--\n",
"\n",
" import __init__ as paper\n",
" import nbconvert, pathlib, click\n",
" file = pathlib.Path(locals().get('__file__', 'readme.md')).parent / 'index.ipynb'\n",
"\n",
" @click.group() \n",
" def application(): ...\n",
"\n",
" @application.command()\n",
" def build():\n",
" to = file.with_suffix('.html')\n",
" to.write_text(\n",
" nbconvert.get_exporter('html')(\n",
" exclude_input=True).from_filename(\n",
" str(file))[0])\n",
" click.echo(F'Built {to}')\n",
" import subprocess\n",
" \n",
" \n",
" @application.command()\n",
" @click.argument('files', nargs=-1)\n",
" def push(files):\n",
" click.echo(__import__('subprocess').check_output(\n",
" F\"gist -u 2947b4bb582e193f5b2a7dbf8b009b62\".split() + list(files)))\n",
"\n",
" if __name__ == '__main__':\n",
" application() if '__file__' in locals() else application.callback()\n",
"\n",
"\n",
"-->\n",
"\n",
"[tony fast]: #\n",
"[markdown]: #\n",
"[python]: #\n",
"[jupyter]: #\n",
"[ipython]: #\n"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"{{load(paper.readme)}}"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"toc-hr-collapsed": false
},
"outputs": [
{
"data": {
"text/markdown": [
"## Best practices for literate programming\n",
"\n",
"The first obligation of the literate programmer, defined by [Donald Knuth](ie.\n",
"the prophet of _[Literate Programming]_), is a core moral commitment to write\n",
"literate programs, because:\n",
"\n",
"> ...; surely nobody wants to admit writing an illiterate program.\n",
">\n",
"> > - [Donald Knuth] _[Literate Programming]_\n",
"\n",
"The following best practices for literate programming have emerged while\n",
"desiging `pidgy`.\n",
"\n",
"### List of best practices\n",
"\n",
"- Restart and run all or it didn't happen.\n",
"\n",
" A document should be literate in all readable, reproducible, and reusable\n",
" contexts.\n",
"\n",
"- When in doubt, abide [Web Content Accessibility Guidelines][wcag] so that\n",
" information can be accessed by differently abled audiences.\n",
"\n",
"- [Markdown] documents are sufficient for single units of thought.\n",
"\n",
" Markdown documents that translate to python can encode literate programs in a\n",
" form that is better if version control systems that the `json` format that\n",
" encodes notebooks.\n",
"\n",
"- All code should compute.\n",
"\n",
" Testing code in a narrative provides supplemental meaning to the `\"code\"`\n",
" signifiers. They provide a test of veracity at least for the computational\n",
" literacy.\n",
"\n",
"- [`readme.md`] is a good default name for a program.\n",
"\n",
" Eventually authors will compose [`\"readme.md\"`] documents that act as both the\n",
" `\"__init__\"` method and `\"__main__\"` methods of the program.\n",
"\n",
"- Each document should stand alone,\n",
" [despite all possibilities to fall.](http://ing.univaq.it/continenza/Corso%20di%20Disegno%20dell'Architettura%202/TESTI%20D'AUTORE/Paul-klee-Pedagogical-Sketchbook.pdf#page=6)\n",
"- Use code, data, and visualization to fill the voids of natural language.\n",
"- Find pleasure in writing.\n",
"- When writing narrative include one unit of meaning per line.\n",
"\n",
" A sentence represents the maximum unit that can be broken up into smaller\n",
" diffable units. This approach will create cleaner histories in revision\n",
" control systems.\n",
"\n",
"[wcag]: https://www.w3.org/WAI/standards-guidelines/wcag/\n",
"[donald knuth]: #\n",
"[literate programming]: #\n",
"[markdown]: #\n",
"[`readme.md`]: #\n"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"{{load(best_practices)}}"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"---\n",
"tangle_weave_diagram: https://user-images.githubusercontent.com/4236275/75093868-bdb12e80-557d-11ea-8989-efd6a733a8e0.png\n",
"\n",
"---\n",
"> I believe that the time is ripe for significantly better documentation of\n",
"> programs, and that we can best achieve this by considering programs to be\n",
"> works of literature.\n",
">> [Donald Knuth]\n",
"\n",
"<!--The introduction should be written as a stand alone essay.-->\n",
"<!--\n",
"\n",
" import figures\n",
"\n",
"-->\n",
"\n",
"## Introduction\n",
"\n",
" \n",
"\n",
"\n",
"\"[Literate programming]\" is a paper published by [Donald Knuth] in 1979. It\n",
"describes a multiobjective, multilingual style of programming that treats programs\n",
"primarily as documentation. Literate programs have measures along two dimensions:\n",
"\n",
"1. the literary qualities determined the document formatting language.\n",
"2. the computational qualities determined by the programming language.\n",
"\n",
"The multilingual nature of literate program creates the opportunity\n",
"for programmers and non-programmers to contribute to the same literature.\n",
"\n",
"Literate programs accept `\"code\"` as an integral part of the narrative.\n",
"`\"code\"` signs can be used in places where language lacks just as figures and equations are used in scientific literature.\n",
"An advantage of `\"code\"` is that it can provide augmented representations\n",
"of documents and their symbols that are tactile and interactive.\n",
"\n",
"![Tangle Weave Diagram]({{tangle_weave_diagram}})\n",
"\n",
"The literate program concurrently describes a program and literature.\n",
"Within the document, natural language and the programming language interact\n",
"through two different process:\n",
"\n",
"1. the tangle process that converts to the programming language.\n",
"2. the weave process that converts to the document formatting language.\n",
"\n",
"The original WEB literate programming implementation chose to tangle to Pascal and weave to Tex. `pidgy`'s modern take on literate programming tangles to [Python] and weaves to [Markdown], and they can be written in either [Markdown] files or `jupyter` `notebooks`.\n",
"\n",
"[Pascal] was originally chosen for its widespread use throughout education,\n",
"and the same can be said for the choice of `jupyter` `notebook`s used\n",
"for education in many programming languages, but most commonly [Python].\n",
"The preferred document language for the `notebook` is [Markdown]\n",
"considering it is part of the notebook schema.\n",
"CP4E\n",
"The motivations made the natural choice for a [Markdown] and [Python]\n",
"programming lanuage.\n",
"Some advantages of this hybrid are that Python is idiomatic and\n",
"sometimes the narrative may be explicitly executable.\n",
"\n",
"[Literate Programming] is alive in places like [Org mode for Emacs], [RMarkdown], [Pweave], [Doctest], or [Literate Coffeescript].\n",
"A conventional look at literate programming will place a focus on the final document. `pidgy` meanwhile places a focus on the interactive literate computing steps required achieve a quality document.\n",
"\n",
"Originally, `pidgy` was designed specifically for the `notebook` file format, but it failed a constraint \n",
"of not being an existing file.\n",
"Now `pidgy` is native for [Markdown] files, and valid testing units.\n",
"It turns out the [Markdown] documents can provide \n",
"a most compact representation of literate program,\n",
"relative to a notebook. And it diffs better.\n",
"\n",
"Design constraints:\n",
"* Use an existing file formats.\n",
"* Minimal bespoke syntax.\n",
"* Importable and testable\n",
"\n",
"A last take on this work is to affirm the reproducibly of enthusiasm when writing literate programs. \n",
"\n",
"\n",
"\n",
"The outcome of writing `pidgy` programs are readable, reusable, and reproducible\n",
"documents. \n",
"`pidgy` natively supports importing markdown and notebooks as source code.\n",
"\n",
"Modern computing has different pieces of software infrastructure than were\n",
"available\n",
"\n",
"[literate programming]: #\n",
"[donald knuth]: #\n",
"[literate coffeescript]: #\n",
"[org mode for emacs]: #\n",
"[jupyter notebooks]: #\n",
"[rmarkdown]: #\n",
"[doctest]: #\n"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"{{load(paper.intro)}}"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"toc-hr-collapsed": false
},
"outputs": [
{
"data": {
"text/markdown": [
"## The `pidgy` extension for programming in Markdown\n",
"\n",
"The `IPython.InteractiveShell` has a configuration system for changing how\n",
"`\"code\"` interacts with the read-eval-print-loop (ie. REPL). `pidgy` uses this\n",
"system to provide a `markdown`-forward REPL interface that can be used with\n",
"`jupyter` tools.\n",
"\n",
"<!--\n",
"\n",
" import jupyter, notebook, IPython, mistune as markdown, IPython as python, ast, jinja2 as template, importnb, doctest, pathlib\n",
" with importnb.Notebook(lazy=True):\n",
" try: from . import loader, tangle, extras\n",
" except: import loader, tangle, extras\n",
" with loader.pidgyLoader(lazy=True):\n",
" try: from . import weave, testing, metadata\n",
" except: import weave, testing, metadata\n",
"-->\n",
"\n",
" def load_ipython_extension(shell: IPython.InteractiveShell) -> None:\n",
"\n",
"The `load_ipython_extension` makes it possible to configure and extend the\n",
"`IPython.InteractiveShell`.\n",
"\n",
" loader.load_ipython_extension(shell)\n",
" tangle.load_ipython_extension(shell)\n",
" extras.load_ipython_extension(shell)\n",
" metadata.load_ipython_extension(shell)\n",
" testing.load_ipython_extension(shell)\n",
" weave.load_ipython_extension(shell)\n",
" ...\n",
"\n",
"1. The `loader` makes it possible to import other markdown documents and\n",
" notebooks as we would with any other [Python] module. The rub is that\n",
" the source code in the program must **Restart and Run All**.\n",
"2. The `tangle` module constructes a line-for-line transformer that\n",
" converts markdown to python.\n",
"3. `pidgy` documents can be used as unit tests. To assist in successful\n",
" tests `pidgy` includes interactive `testing` with each execution. It\n",
" verifies inline code, doctests, test functions, and\n",
" `unittest.TestCase`s.\n",
"4. The `weave` step relies on the `IPython` rich display to show markdown.\n",
" And `jinja` templates.\n",
"\n",
"<!--\n",
"\n",
" def unload_ipython_extension(shell):\n",
"\n",
"`unload_ipython_extension` unloads all the extensions loads in `load_ipython_extension`.\n",
"\n",
" for x in (weave, testing, extras, metadata, tangle):\n",
" x.unload_ipython_extension(shell)\n",
"\n",
"-->\n"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"{{load(pidgy.extension)}}"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"### Events along the `IPython` execution process.\n",
"\n",
"<!--\n",
"\n",
" import datetime, dataclasses, sys, IPython as python, IPython, nbconvert as export, collections, IPython as python, mistune as markdown, hashlib, functools, hashlib, jinja2.meta, ast\n",
" exporter, shell = export.exporters.TemplateExporter(), python.get_ipython()\n",
" modules = lambda:[x for x in sys.modules if '.' not in x and not str.startswith(x,'_')]\n",
"\n",
"-->\n",
"\n",
"pidgin programming is an incremental approach to documents.\n",
"\n",
" @dataclasses.dataclass\n",
" class Events:\n",
"\n",
"The `Events` class is a configurable `dataclasses` object that simplifies\n",
"configuring code execution and metadata collection during interactive computing\n",
"sessions.\n",
"\n",
" shell: IPython.InteractiveShell = dataclasses.field(default_factory=IPython.get_ipython)\n",
" _events = \"pre_execute pre_run_cell post_execute post_run_cell\".split()\n",
" def register(self, shell=None, *, method=''):\n",
" shell = shell or self.shell\n",
"\n",
"A DRY method to `\"register/unregister\" kernel and shell extension objects.\n",
"\n",
" for event in self._events:\n",
" callable = getattr(self, event, None)\n",
" callable and getattr(shell.events, F'{method}register')(event, callable)\n",
" if isinstance(self, ast.NodeTransformer):\n",
" if method:\n",
" self.shell.ast_transformers.pop(self.shell.ast_transformers.index(self))\n",
" else:\n",
" self.shell.ast_transformers.append(self)\n",
" if hasattr(self, 'line_transformers'):\n",
" if method:\n",
" self.shell.line_transformers = [\n",
" x for x in self.shell.line_transformers if x not in self.line_transformers\n",
" ]\n",
" else:\n",
" self.shell.line_transformers.extend(self.line_transformers)\n",
" return self\n",
"\n",
" unregister = functools.partialmethod(register, method='un')\n"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"{{load(pidgy.events, 2)}}"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"## A description of the `pidgy` metalanguage\n",
"\n",
"\n",
"### Everything is markdown\n",
"\n",
"\n",
"\n",
"<!--\n",
" \n",
" import pidgy, jinja2\n",
" Ø = __name__ == '__main__' # Talk about this convention later in the document.\n",
"\n",
"-->\n",
"\n",
"\n",
" You'll need to explicitly print strings to see them formatted as monospace type.\n",
"\n",
"\n",
"\n",
"An important thing to remember about `pidgy` is that all strings default to [Markdown].\n",
"The Python input is translated from markdown input and `...`\n",
"\n",
" \"Even Python strings default to Markdown representations\"\n",
" Ø and print(\"\"\"You'll need to explicitly print strings to see them formatted as monospace type.\"\"\")\n",
"\n",
"\n",
"#### Executing code.\n",
"\n",
"\n",
"There are two ways to define executable `\"code\"` by either __indenting code__ or __code fences w/o a language__.\n",
"\n",
" \"This is code\"\n",
"\n",
"\n",
"```\n",
"\"and so is this\"\n",
"```\n",
"\n",
"```markdown\n",
"but this is not code.\n",
"```\n",
"\n",
"\n",
"\n",
"#### Suppressing output.\n",
"\n",
"\n",
"\n",
"<!--\n",
"\n",
"The output can be suppressed by including a leading a blank line the output.\n",
"\n",
"-->\n",
"\n",
"\n",
"\n",
"<!--\n",
" \n",
" import mistune as markdown, IPython as python, pidgy\n",
"Cells starting with a blank line are not displayed.\n",
"\n",
"-->\n",
"\n",
"\n",
"\n",
"### The `__name__ == '__main__'` pattern.\n",
"\n",
"\n",
"\n",
"### [Markdown] blocks as python objects\n",
"\n",
" >>> a_markdown_block\n",
" 'Lorem ipsum dolor sit amet, ...'\n",
" \n",
"`pidgy` converts `not \"code\"` objects to block strings during the tangling \n",
"step. This allows us to define blocks of [Markdown] as [Python] objects.\n",
"\n",
" a_markdown_block =\\\n",
"Lorem ipsum dolor sit amet, consectetur adipiscing elit. Suspendisse tincidunt \n",
"convallis nunc quis fringilla. Duis faucibus metus et tellus bibendum vestibulum \n",
"bibendum sed massa. Donec ante augue, ullamcorper ac dictum id, eleifend ac turpis.\n",
"\n",
"\n",
"\n",
"### Interactive formal testing. \n",
"\n",
"`pidgy` recognizes a formal testing discovery on increments of code. It is sandwiched\n",
"in between the tangle and weave phases.\n",
"\n",
" import doctest\n",
"#### `doctest`\n",
"\n",
" >>> assert True\n",
" >>> print\n",
" <built-in function print>\n",
" >>> pidgy\n",
" <module...__init__.py'>\n",
"\n",
"\n",
"\n",
"\n",
"### Using `jinja2` templates to weave `pidgy` outputs.\n",
"\n",
"`jinja2` filters and templates can used within [Markdown] source to \n",
"format the output with values from the program.\n",
"\n",
"{{\"jinja templates accept most python expression syntax\"}}\n",
"\n",
"`jinja2` adds features to the environment like blocks, templates, filters, and macros that\n",
"can be reused in templates.\n",
"\n",
"It is not possible to template code to run. That would be dangerous.\n",
"\n",
"\n"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"{{load(pidgy.tests.test_pidgin_syntax)}}"
]
},
{
"cell_type": "markdown",
"metadata": {
"toc-hr-collapsed": false
},
"source": [
"## Applications"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"### Importing and reusing `pidgy` literature\n",
"\n",
"A constraint consistent across most programming languages is that\n",
"programs are executed line-by-line without any\n",
"statements or expressions. raising exceptions \n",
"If literate programs have the computational quality that they __restart\n",
"and run all__ the they should \n",
"When `pidgy` programs have this quality they can <code>import</code> in [Python], they become importable essays or reports.\n",
"\n",
"<!--\n",
"\n",
"\n",
" __all__ = 'pidgyLoader',\n",
" import pidgy, sys, IPython, mistune as markdown, importnb, IPython as python\n",
" with importnb.Notebook(lazy=True):\n",
" try: from . import tangle, extras\n",
" except: import tangle, extras\n",
" if __name__ == '__main__':\n",
" shell = get_ipython()\n",
"\n",
"\n",
"-->\n",
"\n",
"The `pidgyLoader` customizes [Python]'s ability to discover \n",
"[Markdown] and `pidgy` [Notebook]s have the composite `\".md.ipynb\"` extension.\n",
"`importnb` provides a high level API for modifying how content\n",
"[Python] imports different file types.\n",
"\n",
"`sys.meta_path and sys.path_hooks`\n",
"\n",
"\n",
" class pidgyLoader(importnb.Notebook): \n",
" extensions = \".md .md.ipynb\".split()\n",
"\n",
"\n",
"`get_data` determines how a file is decoding from disk. We use it to make an escape hatch for markdown files otherwise we are importing a notebook.\n",
"\n",
"\n",
" def get_data(self, path):\n",
" if self.path.endswith('.md'):\n",
" return self.code(self.decode())\n",
" return super(pidgyLoader, self).get_data(path)\n",
"\n",
"\n",
"The `code` method tangles the [Markdown] to [Python] before compiling to an [Abstract Syntax Tree].\n",
"\n",
"\n",
" def code(self, str): \n",
" with importnb.Notebook(lazy=True):\n",
" try: from . import tangle\n",
" except: import tangle\n",
" return ''.join(tangle.pidgy.transform_cell(str))\n",
"\n",
"\n",
"The `visit` method allows custom [Abstract Syntax Tree] transformations to be applied.\n",
"\n",
"\n",
" def visit(self, node):\n",
" with importnb.Notebook():\n",
" try: from . import tangle\n",
" except: import tangle\n",
" return tangle.ReturnYield().visit(node)\n",
" \n",
"\n",
"\n",
"Attach these methods to the `pidgy` loader.\n",
"\n",
"\n",
" pidgyLoader.code, pidgyLoader.visit = code, visit\n",
" pidgyLoader.get_source = pidgyLoader.get_data = get_data\n",
"\n",
"\n",
"The `pidgy` `loader` configures how [Python] discovers modules when they are\n",
"imported.\n",
"Usually the loader is used as a content manager and in this case we hold the enter \n",
"the context, but do not leave it until `unload_ipython_extension` is executed.\n",
"\n",
"\n",
" def load_ipython_extension(shell):\n",
" setattr(shell, 'loaders', getattr(shell, 'loaders', {}))\n",
" shell.loaders[pidgyLoader] = pidgyLoader(position=-1, lazy=True)\n",
" shell.loaders[pidgyLoader].__enter__()\n",
"\n",
"\n",
"<!--\n",
"\n",
"-->\n"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"{{load(pidgy.loader, 2)}}"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"### `\"readme.md\"` is a good name for a file.\n",
"\n",
"> [**Eat Me, Drink Me, Read Me.**][readme history]\n",
"\n",
"In `pidgy`, the `\"readme.md\"` is treated as the description and implementation\n",
"of the `__main__` program. The code below outlines the `pidgy` command line\n",
"application to reuse literate `pidgy` documents in `markdown` and `notebook`\n",
"files. It outlines how static `pidgy` documents may be reused outside of the\n",
"interactive context.\n",
"\n",
"<!--excerpt-->\n",
"\n",
" ...\n",
"\n",
"<!--\n",
"\n",
" import click, IPython, pidgy, nbconvert, pathlib, re\n",
"\n",
"-->\n",
"\n",
" @click.group()\n",
" def application()->None:\n",
"\n",
"The `pidgy` `application` will group together a few commands that can view,\n",
"execute, and test pidgy documents.\n",
"\n",
"<!---->\n",
"\n",
"#### `\"pidgy run\"` literature as code\n",
"\n",
" @application.command(context_settings=dict(allow_extra_args=True))\n",
" @click.option('--verbose/--quiet', default=True)\n",
" @click.argument('ref', type=click.STRING)\n",
" @click.pass_context\n",
" def run(ctx, ref, verbose):\n",
"\n",
"`pidgy` `run` makes it possible to execute `pidgy` documents as programs, and\n",
"view their pubished results.\n",
"\n",
" import pidgy, importnb, runpy, sys, importlib, jinja2\n",
" comment = re.compile(r'(?s:<!--.*?-->)')\n",
" absolute = str(pathlib.Path().absolute())\n",
" sys.path = ['.'] + sys.path\n",
" with pidgy.pidgyLoader(main=True), importnb.Notebook(main=True):\n",
" click.echo(F\"Running {ref}.\")\n",
" sys.argv, argv = [ref] + ctx.args, sys.argv\n",
" try:\n",
" if pathlib.Path(ref).exists():\n",
" for ext in \".py .ipynb .md\".split(): ref = ref[:-len(ext)] if ref[-len(ext):] == ext else ref\n",
" if ref in sys.modules:\n",
" with pidgy.pidgyLoader(): # cant reload main\n",
" object = importlib.reload(importlib.import_module(ref))\n",
" else: object = importlib.import_module(ref)\n",
" if verbose:\n",
" md = (nbconvert.get_exporter('markdown')(\n",
" exclude_output=object.__file__.endswith('.md.ipynb')).from_filename(object.__file__)[0]\n",
" if object.__file__.endswith('.ipynb')\n",
" else pathlib.Path(object.__file__).read_text())\n",
" md = re.sub(comment, '', md)\n",
" click.echo(\n",
" jinja2.Template(md).render(vars(object)))\n",
" finally: sys.argv = argv\n",
"\n",
"<!---->\n",
"\n",
"#### Test `pidgy` documents in pytest.\n",
"\n",
" @application.command(context_settings=dict(allow_extra_args=True))\n",
" @click.argument('files', nargs=-1, type=click.STRING)\n",
" @click.pass_context\n",
" def test(ctx, files):\n",
"\n",
"Formally test markdown documents, notebooks, and python files.\n",
"\n",
" import pytest\n",
" pytest.main(ctx.args+['--doctest-modules', '--disable-pytest-warnings']+list(files))\n",
"\n",
"<!---->\n",
"\n",
"#### Install `pidgy` as a known kernel.\n",
"\n",
" @application.group()\n",
" def kernel():\n",
"\n",
"`pidgy` is mainly designed to improve the interactive experience of creating\n",
"literature in computational notebooks.\n",
"\n",
"<!---->\n",
"\n",
" @kernel.command()\n",
" def install(user=False, replace=None, prefix=None):\n",
"\n",
"`install` the pidgy kernel.\n",
"\n",
" manager = __import__('jupyter_client').kernelspec.KernelSpecManager()\n",
" path = str((pathlib.Path(__file__).parent / 'kernelspec').absolute())\n",
" try:\n",
" dest = manager.install_kernel_spec(path, 'pidgy')\n",
" except:\n",
" click.echo(F\"System install was unsuccessful. Attempting to install the pidgy kernel to the user.\")\n",
" dest = manager.install_kernel_spec(path, 'pidgy', True)\n",
" click.echo(F\"The pidgy kernel was install in {dest}\")\n",
"\n",
"<!--\n",
"\n",
" @kernel.command()\n",
" def uninstall(user=True, replace=None, prefix=None):\n",
"\n",
"`uninstall` the kernel.\n",
"\n",
" import jupyter_client\n",
" jupyter_client.kernelspec.KernelSpecManager().remove_kernel_spec('pidgy')\n",
" click.echo(F\"The pidgy kernel was removed.\")\n",
"\n",
"\n",
" @kernel.command()\n",
" @click.option('-f')\n",
" def start(user=True, replace=None, prefix=None, f=None):\n",
"\n",
"Launch a `pidgy` kernel applications.\n",
"\n",
" import ipykernel.kernelapp\n",
" with pidgy.pidgyLoader():\n",
" from . import kernel\n",
" ipykernel.kernelapp.IPKernelApp.launch_instance(\n",
" kernel_class=kernel.pidgyKernel)\n",
" ...\n",
"\n",
"-->\n",
"\n",
"[art of the readme]: https://github.com/noffle/art-of-readme\n",
"[readme history]: https://medium.com/@NSomar/readme-md-history-and-components-a365aff07f10\n"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"{{load(pidgy.readme, 2)}}"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"toc-hr-collapsed": false
},
"outputs": [
{
"data": {
"text/markdown": [
"### Configuring the `pidgy` shell and kernel architecture.\n",
"\n",
"![](https://jupyter.readthedocs.io/en/latest/_images/other_kernels.png)\n",
"\n",
"Interactive programming in `pidgy` documents is accessed using the polyglot\n",
"[Jupyter] kernel architecture. In fact, the provenance the [Jupyter]\n",
"name is a combination the native kernel architectures for\n",
"[ju~~lia~~][julia], [pyt~~hon~~][python], and [r]. [Jupyter]'s\n",
"generalization of the kernel/shell interface allows\n",
"over 100 languages to be used in `notebook and jupyterlab`.\n",
"It is possible to define prescribe wrapper kernels around existing\n",
"methods; this is the appraoach that `pidgy` takes\n",
"\n",
"> A kernel provides programming language support in Jupyter. IPython is the default kernel. Additional kernels include R, Julia, and many more.\n",
">\n",
"> > - [`jupyter` kernel definition](https://jupyter.readthedocs.io/en/latest/glossary.html#term-kernel)\n",
"\n",
"`pidgy` is not not a native kernel. It is a wrapper kernel around the\n",
"existing `ipykernel and IPython.InteractiveShell` configurables.\n",
"`IPython` adds extra syntax to python that simulate literate programming\n",
"macros.\n",
"\n",
"<!--\n",
"\n",
" import jupyter_client, IPython, ipykernel.ipkernel, ipykernel.kernelapp, pidgy, traitlets, pidgy, traitlets, ipykernel.kernelspec, ipykernel.zmqshell, pathlib, traitlets\n",
"\n",
"-->\n",
"\n",
"The shell is the application either jupyterlab or jupyter notebook, the kernel\n",
"determines the programming language. Below we design a just jupyter kernel that\n",
"can be installed using\n",
"\n",
"- What is the advantage of installing the kernel and how to do it.\n",
"\n",
"```bash\n",
"pidgy kernel install\n",
"```\n",
"\n",
"#### Configure the `pidgy` shell.\n",
"\n",
" class pidgyInteractiveShell(ipykernel.zmqshell.ZMQInteractiveShell):\n",
"\n",
"Configure a native `pidgy` `IPython.InteractiveShell`\n",
"\n",
" loaders = traitlets.Dict(allow_none=True)\n",
" weave = traitlets.Any(allow_none=True)\n",
" tangle = ipykernel.zmqshell.ZMQInteractiveShell.input_transformer_manager\n",
" extras = traitlets.Any(allow_none=True)\n",
" testing = traitlets.Any(allow_none=True)\n",
" enable_html_pager = traitlets.Bool(True)\n",
"\n",
"`pidgyInteractiveShell.enable_html_pager` is necessary to see rich displays in\n",
"the inspector.\n",
"\n",
" def __init__(self,*args, **kwargs):\n",
" super().__init__(*args, **kwargs)\n",
" with pidgy.pidgyLoader():\n",
" from .extension import load_ipython_extension\n",
" load_ipython_extension(self)\n",
"\n",
"#### Configure the `pidgy` kernel.\n",
"\n",
" class pidgyKernel(ipykernel.ipkernel.IPythonKernel):\n",
" shell_class = traitlets.Type(pidgyInteractiveShell)\n",
" _last_parent = traitlets.Dict()\n",
"\n",
" def init_metadata(self, parent):\n",
" self._last_parent = parent\n",
" return super().init_metadata(parent)\n",
"\n",
"\n",
" def do_inspect(self, code, cursor_pos, detail_level=0):\n",
"\n",
"<details><summary>Customizing the Jupyter inspector behavior for literate computing</summary><p>\n",
"When we have access to the kernel class it is possible to customize\n",
"a number of interactive shell features. The do inspect function\n",
"adds some features to `jupyter`'s inspection behavior when working in \n",
"`pidgy`.\n",
"</p><pre></code>\n",
"\n",
" object = {'found': False}\n",
" if code[:cursor_pos][-3:] == '!!!':\n",
" object = {'found': True, 'data': {'text/markdown': self.shell.weave.format_markdown(code[:cursor_pos-3]+code[cursor_pos:])}}\n",
" else:\n",
" try:\n",
" object = super().do_inspect(code, cursor_pos, detail_level=0)\n",
" except: ...\n",
"\n",
" if not object['found']:\n",
"\n",
"Simulate finding an object and return a preview of the markdown.\n",
"\n",
" object['found'] = True\n",
" line, offset = IPython.utils.tokenutil.line_at_cursor(code, cursor_pos)\n",
" lead = code[:cursor_pos]\n",
" col = cursor_pos - offset\n",
"\n",
"\n",
" code = F\"\"\"<code>·L{\n",
" len(lead.splitlines()) + int(not(col))\n",
" },C{col + 1}</code><br/>\\n\\n\"\"\" + code[:cursor_pos]+'·'+('' if col else '<br/>\\n')+code[cursor_pos:]\n",
"\n",
" object['data'] = {'text/markdown': code}\n",
"\n",
"We include the line number and cursor position to enrich the connection between\n",
"the inspector and the source code displayed on another part of the screen.\n",
"\n",
" return object\n",
" ...\n",
"\n",
"</details>\n",
"\n",
"#### `pidgy`-like interfaces in other languages.\n",
"\n",
"[julia]: #\n",
"[r]: #\n",
"[python]: #\n"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"{{load(pidgy.kernel, 2)}}"
]
},
{
"cell_type": "markdown",
"metadata": {
"toc-hr-collapsed": false
},
"source": [
"## Methods"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"### Tangling [Markdown] to [Python]\n",
"\n",
"The tangle process in literate programming converts the input document \n",
"into the programming language.\n",
"\n",
"\n",
"<!--\n",
" \n",
" import IPython, typing as τ, mistune as markdown, IPython, importnb as _import_, textwrap, ast, doctest, typing, re\n",
"\n",
"-->\n",
"\n",
"\n",
"The `pidgyTransformer` manages the high level API the `IPython.InteractiveShell` interacts with for `pidgy`.\n",
"\n",
"\n",
" class pidgyTransformer(IPython.core.inputtransformer2.TransformerManager):\n",
" def pidgy_transform(self, cell: str) -> str: \n",
" return self.tokenizer.untokenize(self.tokenizer.parse(''.join(cell)))\n",
" \n",
" def transform_cell(self, cell):\n",
" return super().transform_cell(self.pidgy_transform(cell))\n",
" \n",
" def __init__(self, *args, **kwargs):\n",
" super().__init__(*args, **kwargs)\n",
" self.tokenizer = Tokenizer()\n",
"\n",
" def pidgy_magic(self, *text): \n",
" return IPython.display.Code(self.pidgy_transform(''.join(text)), language='python')\n",
"\n",
"\n",
"#### Tokenizer logic\n",
"\n",
"The tokenizer controls the translation of markdown strings to python strings. Our major constraint is that the Markdown input should retain line numbers.\n",
"\n",
"\n",
" class Tokenizer(markdown.BlockLexer):\n",
" class grammar_class(markdown.BlockGrammar):\n",
" doctest = doctest.DocTestParser._EXAMPLE_RE\n",
" default_rules = \"newline hrule block_code fences heading nptable lheading block_quote list_block def_links def_footnotes table paragraph text\".split()\n",
"\n",
" def parse(self, text: str, default_rules=None) -> typing.List[dict]:\n",
" if not self.depth: self.tokens = []\n",
" with self: tokens = super().parse(whiten(text), default_rules)\n",
" if not self.depth: tokens = self.normalize(text, tokens)\n",
" return tokens\n",
"\n",
" def parse_doctest(self, m): self.tokens.append({'type': 'paragraph', 'text': m.group(0)})\n",
"\n",
" def parse_fences(self, m):\n",
" if m.group(2): self.tokens.append({'type': 'paragraph', 'text': m.group(0)})\n",
" else: super().parse_fences(m)\n",
"\n",
" def parse_hrule(self, m):\n",
" self.tokens.append({'type': 'hrule', 'text': m.group(0)})\n",
"\n",
" def normalize(self, text, tokens):\n",
" \"\"\"Combine non-code tokens into contiguous blocks.\"\"\"\n",
" compacted = []\n",
" while tokens:\n",
" token = tokens.pop(0)\n",
" if 'text' not in token: continue\n",
" else: \n",
" if not token['text'].strip(): continue\n",
" block, body = token['text'].splitlines(), \"\"\n",
" while block:\n",
" line = block.pop(0)\n",
" if line:\n",
" before, line, text = text.partition(line)\n",
" body += before + line\n",
" if token['type']=='code':\n",
" compacted.append({'type': 'code', 'lang': None, 'text': body})\n",
" else:\n",
" if compacted and compacted[-1]['type'] == 'paragraph':\n",
" compacted[-1]['text'] += body\n",
" else: compacted.append({'type': 'paragraph', 'text': body})\n",
" if compacted and compacted[-1]['type'] == 'paragraph':\n",
" compacted[-1]['text'] += text\n",
" elif text.strip():\n",
" compacted.append({'type': 'paragraph', 'text': text})\n",
" # Deal with front matter\n",
" if compacted[0]['text'].startswith('---\\n') and '\\n---' in compacted[0]['text'][4:]:\n",
" token = compacted.pop(0)\n",
" front_matter, sep, paragraph = token['text'][4:].partition('---')\n",
" compacted = [{'type': 'front_matter', 'text': F\"\\n{front_matter}\"},\n",
" {'type': 'paragraph', 'text': paragraph}] + compacted\n",
" return compacted\n",
"\n",
" depth = 0\n",
" def __enter__(self): self.depth += 1\n",
" def __exit__(self, *e): self.depth -= 1\n",
"\n",
" def untokenize(self, tokens: τ.List[dict], source: str = \"\"\"\"\"\", last: int =0) -> str:\n",
" INDENT = indent = base_indent(tokens) or 4\n",
" for i, token in enumerate(tokens):\n",
" object = token['text']\n",
" if token and token['type'] == 'code':\n",
" if object.lstrip().startswith(FENCE):\n",
"\n",
" object = ''.join(''.join(object.partition(FENCE)[::2]).rpartition(FENCE)[::2])\n",
" indent = INDENT + num_first_indent(object)\n",
" object = textwrap.indent(object, INDENT*SPACE)\n",
"\n",
" if object.lstrip().startswith(MAGIC): ...\n",
" else: indent = num_last_indent(object)\n",
" elif token and token['type'] == 'front_matter': \n",
" object = textwrap.indent(\n",
" F\"locals().update(__import__('yaml').safe_load({quote(object)}))\\n\", indent*SPACE)\n",
"\n",
" elif not object: ...\n",
" else:\n",
" object = textwrap.indent(object, indent*SPACE)\n",
" for next in tokens[i+1:]:\n",
" if next['type'] == 'code':\n",
" next = num_first_indent(next['text'])\n",
" break\n",
" else: next = indent \n",
" Δ = max(next-indent, 0)\n",
"\n",
" if not Δ and source.rstrip().rstrip(CONTINUATION).endswith(COLON): \n",
" Δ += 4\n",
"\n",
" spaces = num_whitespace(object)\n",
" \"what if the spaces are ling enough\"\n",
" object = object[:spaces] + Δ*SPACE+ object[spaces:]\n",
" if not source.rstrip().rstrip(CONTINUATION).endswith(QUOTES): \n",
" object = quote(object)\n",
" source += object\n",
"\n",
" for token in reversed(tokens):\n",
" if token['text'].strip():\n",
" if token['type'] != 'code': \n",
" source = source.rstrip() + SEMI\n",
" break\n",
"\n",
" return source\n",
" \n",
" for x in \"default_rules footnote_rules list_rules\".split():\n",
" setattr(Tokenizer, x, list(getattr(Tokenizer, x)))\n",
" getattr(Tokenizer, x).insert(getattr(Tokenizer, x).index('block_code'), 'doctest')\n",
" if 'block_html' in getattr(Tokenizer, x):\n",
" getattr(Tokenizer, x).pop(getattr(Tokenizer, x).index('block_html'))\n",
" \n",
" pidgy = pidgyTransformer()\n",
"\n",
"\n",
"\n",
"<!--\n",
" \n",
" # This has to be in a separate cell because the tests go crazy.\n",
" \n",
" (FENCE, CONTINUATION, SEMI, COLON, MAGIC, DOCTEST), QUOTES, SPACE ='``` \\\\ ; : %% >>>'.split(), ('\"\"\"', \"'''\"), ' '\n",
" \n",
"\n",
"-->\n",
"\n",
"\n",
"\n",
"<!--\n",
" \n",
" WHITESPACE = re.compile('^\\s*', re.MULTILINE)\n",
"\n",
" def num_first_indent(text):\n",
" for str in text.splitlines():\n",
" if str.strip(): return len(str) - len(str.lstrip())\n",
" return 0\n",
" \n",
" def num_last_indent(text):\n",
" for str in reversed(text.splitlines()):\n",
" if str.strip(): return len(str) - len(str.lstrip())\n",
" return 0\n",
"\n",
" def base_indent(tokens):\n",
" \"Look ahead for the base indent.\"\n",
" for i, token in enumerate(tokens):\n",
" if token['type'] == 'code':\n",
" code = token['text']\n",
" if code.lstrip().startswith(FENCE): continue\n",
" indent = num_first_indent(code)\n",
" break\n",
" else: indent = 4\n",
" return indent\n",
"\n",
" def quote(text):\n",
" \"\"\"wrap text in `QUOTES`\"\"\"\n",
" if text.strip():\n",
" left, right = len(text)-len(text.lstrip()), len(text.rstrip())\n",
" quote = QUOTES[(text[right-1] in QUOTES[0]) or (QUOTES[0] in text)]\n",
" return text[:left] + quote + text[left:right] + quote + text[right:]\n",
" return text \n",
"\n",
" def num_whitespace(text): return len(text) - len(text.lstrip())\n",
" \n",
" def whiten(text: str) -> str:\n",
" \"\"\"`whiten` strips empty lines because the `markdown.BlockLexer` doesn't like that.\"\"\"\n",
" return '\\n'.join(x.rstrip() for x in text.splitlines())\n",
"\n",
"-->\n",
"\n",
"\n",
"\n",
"<!--\n",
" \n",
" def load_ipython_extension(shell):\n",
" shell.tangle = pidgy_transformer = pidgyTransformer() \n",
" shell.input_transformer_manager = pidgy_transformer\n",
" if not any(x for x in shell.ast_transformers if isinstance(x, ReturnYield)):\n",
" shell.ast_transformers.append(ReturnYield())\n",
" \n",
" def unload_ipython_extension(shell):\n",
" shell.input_transformer_manager = __import__('IPython').core.inputtransformer2.TransformerManager()\n",
" shell.ast_transformers = [x for x in shell.ast_transformers if not isinstance(x, ReturnYield)]\n",
"\n",
"-->\n",
"\n"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"{{load(pidgy.tangle, 2)}}"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"#### Extra langauge features of `pidgy`\n",
"\n",
"<!--\n",
"\n",
"\n",
" import IPython, typing as τ, mistune as markdown, IPython, importnb as _import_, textwrap, ast, doctest, typing, re\n",
" import dataclasses, ast, pidgy\n",
" with pidgy.pidgyLoader(lazy=True):\n",
" try: from . import events\n",
" except: import events\n",
"\n",
"\n",
"-->\n",
"\n",
"##### naming variables with gestures.\n",
"\n",
"We know naming is hard, there is no point focusing on it. `pidgy` allows authors\n",
"to use emojis as variables in python. They add extra color and expression to the narrative.\n",
"\n",
"\n",
" def demojize(lines, delimiters=('_', '_')):\n",
" str = ''.join(lines)\n",
" import tokenize, emoji, stringcase; tokens = []\n",
" try:\n",
" for token in list(tokenize.tokenize(\n",
" __import__('io').BytesIO(str.encode()).readline)):\n",
" if token.type == tokenize.ERRORTOKEN:\n",
" string = emoji.demojize(token.string, delimiters=delimiters\n",
" ).replace('-', '_').replace(\"’\", \"_\")\n",
" if tokens and tokens[-1].type == tokenize.NAME: tokens[-1] = tokenize.TokenInfo(tokens[-1].type, tokens[-1].string + string, tokens[-1].start, tokens[-1].end, tokens[-1].line)\n",
" else: tokens.append(\n",
" tokenize.TokenInfo(\n",
" tokenize.NAME, string, token.start, token.end, token.line))\n",
" else: tokens.append(token)\n",
" return tokenize.untokenize(tokens).decode().splitlines(True)\n",
" except BaseException: raise SyntaxError(str)\n",
"\n",
"\n",
"##### Top level return and yield statements.\n",
"\n",
"<!--\n",
"\n",
"\n",
" def unload_ipython_extension(shell):\n",
" shell.extras.unregister()\n",
"\n",
"\n",
"-->\n"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"{{load(pidgy.extras, 3)}}"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"toc-hr-collapsed": false
},
"outputs": [
{
"data": {
"text/markdown": [
"### Weaving cells in pidgin programs\n",
"\n",
"<!--\n",
"\n",
" import datetime, dataclasses, sys, IPython as python, IPython, nbconvert as export, collections, IPython as python, mistune as markdown, hashlib, functools, hashlib, jinja2.meta, pidgy\n",
" exporter, shell = export.exporters.TemplateExporter(), python.get_ipython()\n",
" modules = lambda:[x for x in sys.modules if '.' not in x and not str.startswith(x,'_')]\n",
" with pidgy.pidgyLoader(lazy=True):\n",
" try:\n",
" from . import events\n",
" except:\n",
" import events\n",
"\n",
"\n",
"-->\n",
"\n",
"pidgin programming is an incremental approach to documents.\n",
"\n",
" def load_ipython_extension(shell):\n",
" shell.display_formatter.formatters['text/markdown'].for_type(str, lambda x: x)\n",
" shell.weave = Weave(shell=shell)\n",
" shell.weave.register()\n",
"\n",
" @dataclasses.dataclass\n",
" class Weave(events.Events):\n",
" shell: IPython.InteractiveShell = dataclasses.field(default_factory=IPython.get_ipython)\n",
" environment: jinja2.Environment = dataclasses.field(default=exporter.environment)\n",
" _null_environment = jinja2.Environment()\n",
"\n",
" def format_markdown(self, text):\n",
" try:\n",
" template = exporter.environment.from_string(text, globals=getattr(self.shell, 'user_ns', {}))\n",
" text = template.render()\n",
" except BaseException as Exception:\n",
" self.shell.showtraceback((type(Exception), Exception, Exception.__traceback__))\n",
" return text\n",
"\n",
" def format_metadata(self):\n",
" parent = getattr(self.shell.kernel, '_last_parent', {})\n",
" return {}\n",
"\n",
" def _update_filters(self):\n",
" self.environment.filters.update({\n",
" k: v for k, v in getattr(self.shell, 'user_ns', {}).items() if callable(v) and k not in self.environment.filters})\n",
"\n",
"\n",
" def post_run_cell(self, result):\n",
" text = strip_front_matter(result.info.raw_cell)\n",
" lines = text.splitlines() or ['']\n",
" IPython.display.display(IPython.display.Markdown(\n",
" self.format_markdown(text) if lines[0].strip() else F\"\"\"<!--\\n{text}\\n\\n-->\"\"\", metadata=self.format_metadata())\n",
" )\n",
" return result\n",
"\n",
" def unload_ipython_extension(shell):\n",
" try:\n",
" shell.weave.unregister()\n",
" except:...\n",
"\n",
" def strip_front_matter(text):\n",
" if text.startswith('---\\n'):\n",
" front_matter, sep, rest = text[4:].partition(\"\\n---\")\n",
" if sep: return ''.join(rest.splitlines(True)[1:])\n",
" return text\n"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"{{load(pidgy.weave, 2)}}"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"### Interactive testing of literate programs\n",
"\n",
"A primary use case of notebooks is to test ideas. Typically this in informally using\n",
"manual validation to qualify the efficacy of narrative and code. To ensure testable literate documents\n",
"we formally test code incrementally during interactive computing.\n",
"\n",
" import unittest, doctest, textwrap, dataclasses, IPython, re, pidgy, sys, typing, types, contextlib, ast, inspect\n",
" with pidgy.pidgyLoader(lazy=True):\n",
" try: from . import events\n",
" except: import events\n",
"\n",
" def make_test_suite(*objects: typing.Union[\n",
" unittest.TestCase, types.FunctionType, str\n",
" ], vars, name) -> unittest.TestSuite:\n",
"\n",
"The interactive testing suite execute `doctest and unittest` conventions\n",
"for a flexible interface to verifying the computational qualities of literate programs.\n",
"\n",
" suite, doctest_suite = unittest.TestSuite(), doctest.DocTestSuite()\n",
" suite.addTest(doctest_suite)\n",
" for object in objects:\n",
" if isinstance(object, type) and issubclass(object, unittest.TestCase):\n",
" suite.addTest(unittest.defaultTestLoader.loadTestsFromTestCase(object))\n",
" elif isinstance(object, str):\n",
" doctest_suite.addTest(doctest.DocTestCase(\n",
" doctest.DocTestParser().get_doctest(object, vars, name, name, 0)))\n",
" doctest_suite.addTest(doctest.DocTestCase(\n",
" InlineDoctestParser().get_doctest(object, vars, name, name, 0), checker=NullOutputCheck))\n",
" elif inspect.isfunction(object):\n",
" suite.addTest(unittest.FunctionTestCase(object))\n",
" return suite\n",
"\n",
" @dataclasses.dataclass\n",
" class Testing(events.Events):\n",
"\n",
"The `Testing` class executes the test suite each time a cell is executed.\n",
"\n",
" function_pattern: str = 'test*'\n",
" def post_run_cell(self, result):\n",
" globs, filename = self.shell.user_ns, F\"In[{self.shell.last_execution_result.execution_count}]\"\n",
"\n",
" with ipython_compiler(self.shell):\n",
" definitions = [self.shell.user_ns[x] for x in self.shell.metadata.definitions\n",
" if x.startswith(self.function_pattern) or\n",
" isinstance(self.shell.user_ns[x], type) and issubclass(self.shell.user_ns[x], unittest.TestCase)\n",
" ]\n",
" result = self.run(make_test_suite(result.info.raw_cell, *definitions, vars=self.shell.user_ns, name=filename))\n",
"\n",
"\n",
" def run(self, suite: unittest.TestCase) -> unittest.TestResult:\n",
" result = unittest.TestResult(); suite.run(result)\n",
" if result.failures:\n",
" sys.stderr.writelines((str(result) + '\\n' + '\\n'.join(msg for text, msg in result.failures)).splitlines(True))\n",
" return result\n",
"\n",
" @contextlib.contextmanager\n",
" def ipython_compiler(shell):\n",
"\n",
"We'll have to replace how `doctest` compiles code with the `IPython` machinery.\n",
"\n",
" def compiler(input, filename, symbol, *args, **kwargs):\n",
" nonlocal shell\n",
" return shell.compile(\n",
" ast.Interactive(\n",
" body=shell.transform_ast(\n",
" shell.compile.ast_parse(shell.transform_cell(textwrap.indent(input, ' '*4)))\n",
" ).body),\n",
" F\"In[{shell.last_execution_result.execution_count}]\",\n",
" \"single\",\n",
" )\n",
"\n",
" yield setattr(doctest, \"compile\", compiler)\n",
" doctest.compile = compile\n",
"\n",
" class NullOutputCheck(doctest.OutputChecker):\n",
" def check_output(self, *e): return True\n",
"\n",
" class InlineDoctestParser(doctest.DocTestParser):\n",
" _EXAMPLE_RE = re.compile(r'`(?P<indent>\\s{0})'\n",
" r'(?P<source>[^`].*?)'\n",
" r'`')\n",
" def _parse_example(self, m, name, lineno): return m.group('source'), None, \"...\", None\n",
"\n",
"\n",
" def load_ipython_extension(shell):\n",
" shell.testing = Testing(shell=shell).register()\n",
"\n",
" def unload_ipython_extension(shell):\n",
" shell.testing.unregister()\n"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"{{load(pidgy.testing, 2)}}"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"#### Literature as the test\n",
"\n",
" import pidgy, pytest, nbval, doctest, importnb.utils.pytest_importnb\n",
" if __name__ == '__main__':\n",
" import notebook, IPython as python\n",
"\n",
"Intertextuallity emerges when the primary target of a program is literature.\n",
"Some of the literary content may include `\"code\"` `object`s that can be tested\n",
"to qualify the veracity of these dual signifiers.\n",
"\n",
"`pidgy` documents are designed to be tested under multiple formal testing\n",
"conditions. This is motivated by the `python`ic concept of documentation\n",
"testing, or `doctest`ing, which in itself is a literate programming style. A\n",
"`pidgy` document includes `doctest`, it verifies `notebook` `input`/`\"output\"`,\n",
"and any formally defined tests are collected.\n",
"\n",
" class pidgyModule(importnb.utils.pytest_importnb.NotebookModule):\n",
"\n",
"`pidgy` provides a `pytest` plugin that works only on `\".md.ipynb\"` files. The\n",
"`pidgy.kernel` works directly with `nbval`, install the python packkage and use\n",
"the --nbval flag. `pidgy` uses features from `importnb` to support standard\n",
"tests discovery, and `doctest` discovery across all strings. Still working on\n",
"coverage. The `pidgyModule` permits standard test discovery in notebooks.\n",
"Functions beginning with `\"test_\"` indicate test functions.\n",
"\n",
" loader = pidgy.pidgyLoader\n",
"\n",
" class pidgyTests(importnb.utils.pytest_importnb.NotebookTests):\n",
"\n",
"if `pidgy` is install then importnb is.\n",
"\n",
" modules = pidgyModule,\n"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"{{load(pidgy.pytest_config.readme, 3)}}"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"### Capturing metadata during the interactive compute process\n",
"\n",
"To an organization, human compute time bears an important cost\n",
"and programming represents a small part of that cycle.\n",
"\n",
" def load_ipython_extension(shell):\n",
"\n",
"The `metadata` module assists in collecting metadata about the interactive compute process.\n",
"It appends the metadata atrribute to the shell.\n",
"\n",
" shell.metadata = Metadata(shell=shell).register()\n",
"\n",
"<!--\n",
"\n",
" import dataclasses, ast, pidgy\n",
" with pidgy.pidgyLoader(lazy=True):\n",
" try: from . import events\n",
" except: import events\n",
"\n",
"-->\n",
"\n",
" @dataclasses.dataclass\n",
" class Metadata(events.Events, ast.NodeTransformer):\n",
" definitions: list = dataclasses.field(default_factory=list)\n",
" def pre_execute(self):\n",
" self.definitions = []\n",
"\n",
" def visit_FunctionDef(self, node):\n",
" self.definitions.append(node.name)\n",
" return node\n",
"\n",
" visit_ClassDef = visit_FunctionDef\n",
"\n",
"<!--\n",
"\n",
" def unload_ipython_extension(shell):\n",
" shell.metadata.unregister()\n",
"\n",
"-->\n"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"{{load(pidgy.metadata, 2)}}"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "pidgy 3",
"language": "python",
"name": "pidgy"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.3"
},
"toc-showcode": true
},
"nbformat": 4,
"nbformat_minor": 4
}
tangle_weave_diagram

I believe that the time is ripe for significantly better documentation of programs, and that we can best achieve this by considering programs to be works of literature.

Donald Knuth

Introduction

"Literate programming" is a paper published by Donald Knuth in 1979. It describes a multiobjective, multilingual style of programming that treats programs primarily as documentation. Literate programs have measures along two dimensions:

  1. the literary qualities determined by the document formatting language.
  2. the computational qualities determined by the programming language.

The multilingual nature of literate programs creates the opportunity for programmers and non-programmers to contribute to the same literature.

Literate programs accept "code" as an integral part of the narrative. "code" signs can be used in places where natural language falls short, just as figures and equations are used in scientific literature. An advantage of "code" is that it can provide augmented representations of documents and their symbols that are tactile and interactive.

Tangle Weave Diagram

The literate program concurrently describes a program and literature. Within the document, natural language and the programming language interact through two different processes:

  1. the tangle process that converts to the programming language.
  2. the weave process that converts to the document formatting language.

The original WEB literate programming implementation chose to tangle to Pascal and weave to TeX. pidgy's modern take on literate programming tangles to [Python] and weaves to [Markdown], and its programs can be written in either [Markdown] files or jupyter notebooks.
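The two processes can be illustrated with a minimal sketch. It is not pidgy's implementation: the function names `tangle` and `weave` are hypothetical, only indented code blocks are treated as code, prose is assumed to appear at the top level, and a regular expression stands in for the template engine. Unlike pidgy, this sketch does not preserve line numbers.

```python
import re


def tangle(markdown: str) -> str:
    """Hypothetical tangle: keep indented code, turn prose into string literals."""
    out, prose = [], []

    def flush():
        # Emit the buffered prose as a bare string expression, which is valid Python.
        text = "\n".join(prose).strip()
        if text:
            out.append(repr(text))
        prose.clear()

    for line in markdown.splitlines():
        if line.startswith("    ") and line.strip():
            flush()
            out.append(line[4:])  # drop the four-space Markdown code indent
        else:
            prose.append(line)
    flush()
    return "\n".join(out)


def weave(markdown: str, namespace: dict) -> str:
    """Hypothetical weave: fill {{ expression }} slots from the namespace."""
    return re.sub(
        r"\{\{(.+?)\}\}",
        lambda m: str(eval(m.group(1).strip(), namespace)),
        markdown,
    )


source = "Compute a value.\n\n    x = 1 + 1\n\nThe value is {{ x }}.\n"
namespace = {}
exec(tangle(source), namespace)   # run the tangled program
print(weave(source, namespace))   # render the document with live values
```

Here the same document is executed once as a program and rendered once as literature, which is the dual reading the tangle and weave steps provide.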

[Pascal] was originally chosen for its widespread use throughout education, and the same can be said for jupyter notebooks, which are used for education in many programming languages, most commonly [Python]. The preferred document language for the notebook is [Markdown], considering it is part of the notebook schema. Together with CP4E, these motivations made [Markdown] and [Python] the natural choice for a hybrid programming language. Some advantages of this hybrid are that Python is idiomatic and that sometimes the narrative may be explicitly executable.

Literate Programming is alive in places like Org mode for Emacs, RMarkdown, [Pweave], Doctest, or Literate Coffeescript. A conventional look at literate programming will place a focus on the final document. pidgy meanwhile places a focus on the interactive literate computing steps required to achieve a quality document.

Originally, pidgy was designed specifically for the notebook file format, but that design failed the constraint of using an existing file format. Now pidgy is native to [Markdown] files, which are also valid testing units. It turns out that [Markdown] documents provide a more compact representation of a literate program than a notebook, and they diff better.

Design constraints:

  • Use existing file formats.
  • Minimal bespoke syntax.
  • Importable and testable.

A last take on this work is to affirm the reproducibility of enthusiasm when writing literate programs.

The outcome of writing pidgy programs is readable, reusable, and reproducible documents.
pidgy natively supports importing markdown and notebooks as source code.
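As a rough illustration of what importing a document means, the sketch below builds a module from Markdown text using only the standard library. It is an assumption-laden stand-in, not pidgy's loader: pidgy itself imports documents through `pidgy.pidgyLoader`, while `module_from_markdown` is a hypothetical helper that executes only indented code blocks and blanks out prose so that line numbers still match the Markdown source.

```python
import types


def module_from_markdown(name: str, markdown: str) -> types.ModuleType:
    """Hypothetical helper: execute the indented code blocks of a Markdown document as a module."""
    # Prose lines become blank lines, so errors report the Markdown line numbers.
    code_lines = [
        line[4:] if line.startswith("    ") else ""
        for line in markdown.splitlines()
    ]
    module = types.ModuleType(name)
    exec(compile("\n".join(code_lines), f"{name}.md", "exec"), module.__dict__)
    return module


document = (
    "# Greeting\n"
    "\n"
    "A tiny literate module.\n"
    "\n"
    "    def greet(name):\n"
    "        return 'hello ' + name\n"
)
greeting = module_from_markdown("greeting", document)
print(greeting.greet("pidgy"))
```

Once the document is a module, its functions can be reused from other programs or collected by a test runner like any other Python code.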

Modern computing has different pieces of software infrastructure than were available

pidgy kernel install
git+https://github.com/deathbeds/pidgy@edits
pandas
matplotlib
sklearn