PEP 350 -- Codetags | Python.org

URL: https://www.python.org/dev/peps/pep-0350/


This PEP has been rejected. While the community may be interested, there is no desire to make the standard library conform to this standard.

This informational PEP aims to provide guidelines for consistent use of codetags, which would enable the construction of standard utilities to take advantage of the codetag information, as well as making Python code more uniform across projects. Codetags also represent a very lightweight programming micro-paradigm and become useful for project management, documentation, change tracking, and project health monitoring. This is submitted as a PEP because its ideas are thought to be Pythonic, although the concepts are not unique to Python programming. Herein are the definition of codetags, the philosophy behind them, a motivation for standardized conventions, some examples, a specification, a toolset description, and possible objections to the Codetag project/paradigm.

This PEP is also living as a wiki [1] for people to add comments.

Programmers widely use ad-hoc code comment markup conventions to serve as reminders of sections of code that need closer inspection or review. Examples of markup include FIXME, TODO, XXX, BUG, but there are many more in wide use in existing software. Such markup will henceforth be referred to as codetags. These codetags may show up in application code, unit tests, scripts, general documentation, or wherever suitable.

Codetags have been under discussion and in use (hundreds of codetags in the Python 2.4 sources) in many places (e.g., c2 [3]) for many years. See References for further historic and current information.

If you subscribe to most of the following values, then codetags will likely be useful for you.

  1. As much information as possible should be contained inside the source code (application code or unit tests). This along with use of codetags impedes duplication. Most documentation can be generated from that source code; e.g., by using help2man, man2html, docutils, epydoc/pydoc, ctdoc, etc.
  2. Information should almost never be duplicated -- it should be recorded in a single original format and all other locations should be automatically generated from the original, or simply be referenced. This is famously known as the Single Point Of Truth (SPOT) or Don't Repeat Yourself (DRY) rule.
  3. Documentation that gets into customers' hands should be auto-generated from single sources into all other output formats. People want documentation in many forms. It is thus important to have a documentation system that can generate all of these.
  4. The developers are the documentation team. They write the code and should know the code the best. There should not be a dedicated, disjoint documentation team for any non-huge project.
  5. Plain text (with non-invasive markup) is the best format for writing anything. All other formats are to be generated from the plain text.

Codetag design was influenced by the following goals:

  1. Comments should be short whenever possible.
  2. Codetag fields should be optional and of minimal length. Default values and custom fields can be set by individual code shops.
  3. Codetags should be minimalistic. The quicker it is to jot something down, the more likely it is to get jotted.
  4. The most common use of codetags will only have zero to two fields specified, and these should be the easiest to type and read.

This shows a simple codetag as commonly found in sources everywhere (with the addition of a trailing <>):

# FIXME: Seems like this loop should be finite. <>
while True: ...

The following contrived example demonstrates a typical use of codetags. It uses some of the available fields to specify the assignees (a pair of programmers with initials MDE and CLE), the Date of expected completion (Week 14), and the Priority of the item (2):

# FIXME: Seems like this loop should be finite. <MDE,CLE d:14w p:2>
while True: ...

This codetag shows a bug with fields describing author, discovery (origination) date, due date, and priority:

# BUG: Crashes if run on Sundays.
# <MDE 2005-09-04 d:14w p:2>
if day == 'Sunday': ...

Here is a demonstration of how not to use codetags. This has many problems: 1) Codetags cannot share a line with code; 2) Missing colon after mnemonic; 3) A codetag referring to codetags is usually useless, and worse, it is not completable; 4) No need to have a bunch of fields for a trivial codetag; 5) Fields with unknown values (t:XXX) should not be used:

i = i + 1   # TODO Add some more codetags.
# <JRNewbie 2005-04-03 d:2005-09-03 t:XXX d:14w p:0 s:inprogress>

This describes the format: syntax, mnemonic names, fields, and semantics, and also the separate DONE File.

Each codetag should be inside a comment, and can be any number of lines. It should not share a line with code. It should match the indentation of surrounding code. The end of the codetag is marked by a pair of angle brackets <> containing optional fields, which must not be split onto multiple lines. It is preferred to have a codetag in # comments instead of string comments. There can be multiple fields per codetag, all of which are optional.

In short, a codetag consists of a mnemonic, a colon, commentary text, an opening angle bracket, an optional list of fields, and a closing angle bracket. E.g.,

# MNEMONIC: Some (maybe multi-line) commentary. <field field ...>
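The syntax just described can be sketched as a small parser. This is a minimal illustration, not part of the PEP: the name `parse_codetag`, the regular expression, and the choice to return a tuple are all assumptions made for this example, and it only handles `#` comments whose fields sit on one line.

```python
import re

# Assumed pattern: MNEMONIC, colon, commentary, then <fields> closing the tag.
CODETAG_RE = re.compile(
    r"^(?P<mnemonic>[A-Z?!]+):\s*(?P<text>.*?)\s*<(?P<fields>[^<>]*)>\s*$",
    re.DOTALL,
)


def parse_codetag(comment_lines):
    """Parse a block of '#' comment lines into (mnemonic, text, fields).

    Returns None when the lines do not form a codetag (hypothetical helper).
    """
    # Strip indentation and the leading '#' from each line, then join them.
    body = " ".join(line.strip().lstrip("#").strip() for line in comment_lines)
    match = CODETAG_RE.match(body)
    if match is None:
        return None
    return (
        match.group("mnemonic"),
        match.group("text"),
        match.group("fields").split(),
    )
```

Passing the two comment lines of the BUG example above would yield the mnemonic `BUG`, the commentary text, and the four field tokens; a comment with no colon after the mnemonic yields `None`.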

The codetags of interest are listed below, using the following format:

To do: Informal tasks/features that are pending completion.
FIXME (XXX, DEBUG, BROKEN, REFACTOR, REFACT, RFCTR, OOPS, SMELL, NEEDSWORK, INSPECT) Fix me: Areas of problematic or ugly code needing refactoring or cleanup.
Bugs: Reported defects tracked in bug database.
Will Not Be Fixed: Problems that are well-known but will never be addressed due to design problems or domain limitations.
Requirements: Satisfactions of specific, formal requirements.
Requests For Enhancement: Roadmap items not yet implemented.
Ideas: Possible RFE candidates, but less formal than RFE.
Questions: Misunderstood details.
Alerts: In need of immediate attention.
Hacks: Temporary code to force inflexible functionality, or simply a test change, or workaround a known problem.
Portability: Workarounds specific to OS, Python version, etc.
Caveats: Implementation details/gotchas that stand out as non-intuitive.
Notes: Sections where a code reviewer found something that needs discussion or further investigation.
Frequently Asked Questions: Interesting areas that require external explanation.
Glossary: Definitions for project glossary.
See: Pointers to other code, web link, etc.
Needs Documentation: Areas of code that still need to be documented.
Credits: Accreditations for external provision of enlightenment.
Status: File-level statistical indicator of maturity of this file.
Reviewed: File-level indicator that review was conducted.

File-level codetags might be better suited as properties in the revision control system, but might still be appropriately specified in a codetag.

Some of these are temporary (e.g., FIXME) while others are persistent (e.g., REQ). A mnemonic was chosen over a synonym using three criteria: descriptiveness, length (shorter is better), and common usage.

Choosing between FIXME and XXX is difficult. XXX seems to be more common, but much less descriptive. Furthermore, XXX is a useful placeholder in a piece of code having a value that is unknown. Thus FIXME is the preferred spelling. Sun says [4] that XXX and FIXME are slightly different, giving XXX higher severity. However, with decades of chaos on this topic, and too many millions of developers who won't be influenced by Sun, it is easy to rightly call them synonyms.

DONE is always a completed TODO item, but this should probably be indicated through the revision control system and/or a completion recording mechanism (see DONE File).

It may be a useful metric to count NOTE tags: a high count may indicate a design (or other) problem. But of course the majority of codetags indicate areas of code needing some attention.

An FAQ is probably more appropriately documented in a wiki where users can more easily view and contribute.

All fields are optional. The proposed standard fields are described in this section. Note that upper case field characters are intended to be replaced.

The Originator/Assignee and Origination Date/Week fields are the most common and don't usually require a prefix.

This lengthy list of fields is liable to scare people (the intended minimalists) away from adopting codetags, but keep in mind that these only exist to support programmers who either 1) like to keep BUG or RFE codetags in a complete form, or 2) are using codetags as their complete and only tracking system. In other words, many of these fields will be used very rarely. They are gathered largely from industry-wide conventions, and example sources include GCC Bugzilla [5] and Python's SourceForge [6] tracking systems.

List of Originator or Assignee initials (the context determines which unless both should exist). It is also okay to use usernames such as MicahE instead of initials. Initials (in upper case) are the preferred form.
List of Assignee initials. This is necessary only in (rare) cases where a codetag has both an assignee and an originator, and they are different. Otherwise the a: prefix is omitted, and context determines the intent. E.g., FIXME usually has an Assignee, and NOTE usually has an Originator, but if a FIXME was originated (and initialed) by a reviewer, then the assignee's initials would need an a: prefix.
The Origination Date indicating when the comment was added, in ISO 8601 [2] format (digits and hyphens only). Or Origination Week, an alternative form for specifying an Origination Date. A day of the week can be optionally specified. The w suffix is necessary for distinguishing from a date.
Due Date (d) target completion (estimate). Or Due Week (d), an alternative to specifying a Due Date.
Priority (p) level. Range (N) is from 0..3 with 3 being the highest. 0..3 are analogous to low, medium, high, and showstopper/critical. The Severity field could be factored into this single number, and doing so is recommended since having both is subject to varying interpretation. The range and order should be customizable. The existence of this field is important for any tool that itemizes codetags. Thus a (customizable) default value should be supported.
Tracker (t) number corresponding to associated Ticket ID in separate tracking system.

The following fields are also available but expected to be less common.

Category (c) indicating some specific area affected by this item.
Status (s) indicating state of item. Examples are "unexplored", "understood", "inprogress", "fixed", "done", "closed". Note that when an item is completed it is probably better to remove the codetag and record it in a DONE File.
Development cycle Iteration (i). Useful for grouping codetags into completion target groups.
Development cycle Release (r). Useful for grouping codetags into completion target groups.

To summarize, the non-prefixed fields are initials and origination date, and the prefixed fields are: assignee (a), due (d), priority (p), tracker (t), category (c), status (s), iteration (i), and release (r).

It should be possible for groups to define or add their own fields, and these should have upper case prefixes to distinguish them from the standard set. Examples of custom fields are Operating System (O), Severity (S), Affected Version (A), Customer (C), etc.
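The field rules above can be sketched as a small classifier for the text between the angle brackets. This is only an illustration under stated assumptions: `parse_fields`, the dictionary keys, and both regular expressions are invented for this example, and the exact spelling of the optional day-of-week part of a week value (assumed here to be `.D` before the `w`) is a guess.

```python
import re

# Prefix letters come from the summary above; the long names are labels
# chosen for this sketch only.
PREFIXED = {
    "a": "assignee", "d": "due", "p": "priority", "t": "tracker",
    "c": "category", "s": "status", "i": "iteration", "r": "release",
}
# Origination date: ISO 8601 with digits and hyphens only.
DATE_RE = re.compile(r"^\d{4}(-\d{2}(-\d{2})?)?$")
# Origination week: trailing 'w'; the optional '.D' day part is an assumption.
WEEK_RE = re.compile(r"^\d{1,2}(\.\d)?w$")


def parse_fields(field_text):
    """Map a field string such as 'MDE,CLE d:14w p:2' to a dict."""
    result = {}
    for token in field_text.split():
        prefix, sep, value = token.partition(":")
        if sep and prefix in PREFIXED:
            result[PREFIXED[prefix]] = value
        elif sep:
            # Unknown prefix, for example an upper-case custom field:
            # keep it under its own letter.
            result[prefix] = value
        elif DATE_RE.match(token) or WEEK_RE.match(token):
            result["origination"] = token
        else:
            # Non-prefixed, non-date token: treat as a list of initials.
            result["initials"] = token.split(",")
    return result
```

Values are kept as strings so that a shop can apply its own defaults and ranges (for instance for priority) afterwards.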

Some codetags have an ability to be completed (e.g., FIXME, TODO, BUG). It is often important to retain completed items by recording them with a completion date stamp. Such completed items are best stored in a single location, global to a project (or maybe a package). The proposed format is most easily described by an example, say ~/src/fooproj/DONE:

# TODO: Recurse into subdirs only on blue
# moons. <MDE 2003-09-26>
[2005-09-26 Oops, I underestimated this one a bit.  Should have
used Warsaw's First Law!]

# FIXME: ...
...

You can see that the codetag is copied verbatim from the original source file. The date stamp is then entered on the following line with an optional post-mortem commentary. The entry is terminated by a blank line (\n\n).
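The entry layout described here can be sketched as a formatting helper. The function name `done_entry` and its parameters are hypothetical, and emitting a bare `[date]` line when there is no commentary is an assumption; the example above only shows the form with commentary.

```python
import datetime


def done_entry(codetag_lines, commentary="", completed=None):
    """Format one DONE-file entry for a completed codetag (illustrative)."""
    completed = completed or datetime.date.today()
    stamp = completed.isoformat()
    # Date stamp with optional post-mortem commentary, in square brackets.
    note = f"[{stamp} {commentary}]" if commentary else f"[{stamp}]"
    # Codetag copied verbatim, then the stamped line, then a blank line
    # that terminates the entry.
    return "\n".join(list(codetag_lines) + [note]) + "\n\n"
```

Appending the returned string to the project's DONE file, and deleting the codetag from the source, would complete the bookkeeping for one item.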

It may sound burdensome to have to delete codetag lines every time one gets completed. But in practice it is quite easy to set up a Vim or Emacs mapping to auto-record a codetag deletion in this format (sans the commentary).

Currently, programmers (and sometimes analysts) typically use grep to generate a list of items corresponding to a single codetag. However, various hypothetical productivity tools could take advantage of a consistent codetag format. Some example tools follow.

Document Generator: Possible docs: glossary, roadmap, manpages.
Codetag History: Track (with revision control system interface) when a BUG tag (or any codetag) originated/resolved in a code section.
Code Statistics: A project Health-O-Meter.
Codetag Lint: Notify of invalid use of codetags, and aid in porting to codetags.
Story Manager/Browser: An electronic means to replace XP notecards. In MVC terms, the codetag is the Model, and the Story Manager could be a graphical Viewer/Controller to do visual rearrangement, prioritization, and assignment, milestone management.
Any Text Editor: Used for changing, removing, adding, rearranging, recording codetags.

There are some tools already in existence that take advantage of a smaller set of pseudo-codetags (see References). There is also an example codetags implementation under way, known as the Codetag Project [7].
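As a rough idea of what a Codetag Lint could look like, the sketch below flags two of the problems listed in the "how not to use codetags" example: a codetag sharing a line with code, and a missing colon after the mnemonic. It is not taken from the Codetag Project; `lint_codetags`, the messages, and the short mnemonic list are assumptions, and the check is line-based, so a `#` inside a string literal would confuse it.

```python
import re

# A small, illustrative subset of mnemonics named in this document.
MNEMONICS = ("TODO", "FIXME", "BUG", "REQ", "RFE", "NOTE", "FAQ")
TAG_RE = re.compile(r"#\s*(?P<mnemonic>%s)\b(?P<rest>.*)" % "|".join(MNEMONICS))


def lint_codetags(source):
    """Return (line_number, message) pairs for suspicious codetag usage."""
    problems = []
    for number, line in enumerate(source.splitlines(), start=1):
        match = TAG_RE.search(line)
        if match is None:
            continue
        # Anything other than whitespace before the '#' means code is present.
        if line[: match.start()].strip():
            problems.append((number, "codetag shares a line with code"))
        if not match.group("rest").startswith(":"):
            problems.append((number, "missing colon after mnemonic"))
    return problems
```

Run over the bad example shown earlier, this reports both problems on its first line and nothing for a well-formed FIXME comment.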



Objection: Codetags tend to only rarely have estimated completion dates of any sort. OK, the fields are optional, but you want to suggest fields that actually will be widely used.
Defense:




Some other tools have approached defining/exploiting codetags. See http://tracos.org/codetag/wiki/Links.