AirPair:reprehensible

From Jonathan D. Lettvin

Don't do that!

Introduction

"Premature optimization is the root of all evil (or at least most of it) in programming."(Knuth)[1]

There will always be people who will seek solutions to problems come hell or high water, and other people who will wag a finger and look sternly over their lorgnettes at the egregious methods used by the problem solvers. This battle between hackers and pedants inevitably results in a lively history of online discussions containing very strong opinions and sometimes even useful examples. A sign that a language is important is the liveliness of these discussions. Trolls also like to join in, saying things like "Don't do that" or "Who cares". Most of us programmers fall somewhere between these extremes, following the rules while the perceived cost is acceptable. This article is about making life easier for Python developers while grinning at trolls.

In this article, a single line of "Don't do that" cleverness is used to simplify dictionary syntax. This syntax helps to attach members and methods to existing class instances without brackets and quotes. It is encapsulated in a class that adds value to this cleverness. Python is already easy to use but this makes it easier yet, increasing Python's already great prototyping speed. More "Don't do that" style is offered in the unit test, also shown in this article, all in service to a compromise between hacking and pedantry. In other words, "go ahead and do what you must" and let the criticisms ensue.

A word of advice: If you program outside the conventional rules, make heavy use of syntax checkers and secondary tools to ensure that the only code that disobeys the rules is precisely the code you intended. All the code in this article has been checked with four separate tools:

  • pep8
  • pyflakes
  • PyChecker
  • pylint

One last thing: comment the rule violation with why you chose the violation and reference URLs or core technology explanations to support it.

Core Programming Process

Core/library programming is a six-phase process:

  1. Express a new idea in code and test that it works.
  2. Improve the expression so it can be used reliably.
  3. Put an interface on it so others can use it.
  4. Integrate it into an application where it is useful.
  5. Harden it against anticipated edge and corner cases.
  6. Harden it against unanticipated operational failure in the field.

Each phase has its own "best practices". My personal experience is that phases 1 through 4 (prototyping) are best done in a scripting language, while phases 5 and 6 typically involve creating compiled libraries where some effort is put into making the code "small, fast, correct, complete, and secure". For many scripting languages, the shift to compiled code should happen after phase 2. Python makes it possible to defer this to phase 5. In my experience, the syntax of Python reduces the number of keystrokes and head scratches while increasing LLOC[2] per day by a factor of between 4 and 10 over compiled languages, and by a factor of 2 over other interpreted languages, during phases 1 through 4. I have not experienced better prototyping productivity with any other language. The Dict class offered in this article is a phase 4 module, integrated but not hardened.

Why Python?

Python is a somewhat paradoxical language in the midst of others. On the one hand, there are formal languages with firm syntax and semantics, such as LISP, APL, or Ada. On the other hand, there are convenient scripting languages that were not so much designed as tweaked, such as bash, perl, or Python. One gets the feeling that what a language will and will not do has less to do with formal design than with the character of its author. There are languages that break important syntax rules to achieve perceived consistency, such as C++ [3] [4] [5] [6] [7]. In fact, it is impossible to declare a language superior except in its use to develop a solution space for a problem space within constraints. Python is paradoxically useful amongst other languages mainly due to the community surrounding it and the social mores that have evolved for its use. Perhaps the most important social contributions come from scientists who, for unknown reasons, chose to endow Python with mature mathematical and structural libraries (scipy[8], ScientificPython[9], Mayavi[10], and many more). It is not just the social value of PEP8[11] but the actual optimizations in those mature scientific libraries that make Python worth choosing over other languages.

Losses due to Impedance Mismatch at Language Interfaces

There is one absolute truth about computer programs. There is a Turing machine implemented in the chip instruction set, with various optimizations depending on the intended hardware application, upon which a language author imposes their personal bias about how that instruction set ought to be used. Typically, languages are layered one upon the other such as C++ atop C atop assembly language atop machine code. Each layer loses something valuable from the layer below it while gaining some advantage in problem solving abstractions. There really are no objective criteria by which to declare any one language "the best", even for its intended use.

"By understanding a machine-oriented language, the programmer will tend to use a much more efficient method; it is much closer to reality."(Knuth)[1]

One of my favorite instructions is the computed GOTO (available in GNU C++ but not in Microsoft Visual C++ in 2008). No lexer can achieve its optimal throughput without this instruction (please prove me wrong). GOTO[12][13] is like a scalpel: important to the surgeon, but kept out of the hands of most children.

Python vs. perl (for example)

All this editorializing aside, Python favors ease of development and maintenance over strict protections and enforcements. It has a set of PEP documents (Python Enhancement Proposals[14]) where arguments are reviewed and decided. It has a very important style guideline called PEP8 [11]. This one guideline is probably responsible for a major difference between Python and perl. In perl, cleverness and obfuscation are encouraged and applauded[15]. In Python they are discouraged and criticized (specifically in PEP8). The difference in coding behaviors and proud moments of the two programming communities is stark.

All the important code presented here passes PEP8 guidelines (install and use pep8). Another important syntax checker to use is pyflakes.

The files Dict.py and testdict.py fail syntax checking for the very reason they exist. One hides the use of imported MethodType in an exec string. The other hides a variable in an exec string. This is part of why they belong in a "Don't do that" article.

# easy_install pep8 pyflakes pylint
# pep8 Dict.py
# pyflakes Dict.py
# pylint Dict.py | less

Consider integrating pep8, pyflakes, pylint, and PyChecker into your programmer's editor (e.g. emacs or vi) and development cycle.

Constructing Classes out of Pieces (not pep8/pyflakes/pylint/PyChecker checked)

from types import MethodType
class aClass(object): pass
def aMethod(self): print('A message')
anInstance = aClass()
setattr(anInstance, 'aName', MethodType(aMethod, anInstance, aClass))
anInstance.aName()

Look very carefully at this example.

  1. On line 1 a necessary type is imported to enable attaching new methods to class instances.
  2. On line 2 a class is defined having no members, no methods, no statics; nothing except structural necessities.
  3. On line 3 a function is defined which has the signature of a method, but not within a class.
  4. On line 4 the class is instanced.
  5. On line 5 the method is made into a member of the instance under a name.
  6. On line 6 the instance.method() call executes the function through the instance.

This pattern runs contrary to safe practice, and it is the pattern this article explores. Duck typing[16] is also unsafe, but is one of the key features underlying Python's simplified development cycle.
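The three-argument call to MethodType shown above belongs to Python 2. As a minimal sketch, here is the same pattern written for Python 3, where MethodType takes only the function and the instance. The names Greeter, greet, and say are hypothetical and chosen for illustration only.

```python
from types import MethodType


class Greeter(object):
    """An empty class: no members, no methods, no statics."""


def greet(self):
    """A plain function that merely has the signature of a method."""
    return 'A message from %s' % type(self).__name__


instance = Greeter()
# Bind the function to this one instance under the name 'say'.
instance.say = MethodType(greet, instance)
message = instance.say()

# The binding lives on the instance, not on the class.
other = Greeter()
has_say = hasattr(other, 'say')
```

Only the one instance gains the new method. Other instances of the same class are untouched, which is part of what makes the pattern both convenient and risky.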

Syntax support for Dictionaries

The source for the Dict class (below) is the only code you need to copy from this article. A second source for Argv illustrates the syntax simplification in brief and clear form. A third source for testdict (also below) illustrates some of the capabilities of the Dict class by exercising it with unit tests.

The Dict class employs a trick to make dictionary syntax easier. It takes advantage of the fact that classes have a built-in dictionary that can be used without the cumbersome brackets and quotes syntax. The trick is to substitute the builtin class __dict__ with a dict parent class and give it method support for even denser yet clearer syntax.

The Dict class is the principal focus of this article. Both external API methods and internal service methods are defined. The goal is to be able to attach new members and methods to an existing instance while avoiding the cumbersome dictionary syntax of brackets and quotes, substituting the much easier lexical syntax of Python for both attachment and use.
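The trick can be seen in isolation in a few lines. This is a sketch only, written for Python 3, and AttrDict is a hypothetical stand-in for the article's Dict class with everything but the substitution stripped away.

```python
class AttrDict(dict):
    """Hypothetical stand-in for Dict: a dict that is its own __dict__."""

    def __init__(self, **kw):
        super().__init__(**kw)
        # The instance's attribute namespace becomes the dictionary itself.
        self.__dict__ = self


record = AttrDict(a=1, b=2)
record.a = 3        # an attribute write lands in the dictionary
record['c'] = 4     # a key write becomes visible as an attribute
```

After these statements, record.c and record['c'] refer to the same value, because both kinds of access read and write one shared mapping. One consequence worth keeping in mind: a key that matches the name of a dict method (items, for example) would shadow that method on attribute access.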

PUBLIC API

__init__
updates the instance __dict__ from **kw keyword arguments.
__call__
updates the instance __dict__ from **kw keyword arguments.
let
offers unquoted key setting.
method
enables attaching new methods to existing class instances [17].

PRIVATE API

_clean
removes lexically troubling characters (e.g. from sys.argv)
_method
converts an external function into an internal class method

The __dict__ member of this class may be added to, deleted from and modified at will as long as key token naming conventions for variable names are followed. This considerably simplifies coding tasks by reducing syntax complexity.

The let method automatically removes the characters listed in Dict.remove (the '-' of argv options, plus '<' and '>') from keys. The method named method automatically wraps a function to make it a MethodType member of an instance.
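The key cleaning in Dict relies on the Python 2 form of translate, which accepts None and a string of characters to delete. As a sketch of how the same cleaning might look under Python 3, the snippet below builds a deletion table with str.maketrans. The names REMOVE, DELETE_TABLE, and clean are hypothetical, and the sample keys imitate what docopt returns.

```python
REMOVE = '-<>'  # the same characters as the article's Dict.remove

# With three arguments, str.maketrans maps nothing and deletes
# every character found in the third argument.
DELETE_TABLE = str.maketrans('', '', REMOVE)


def clean(key):
    """Strip characters that cannot appear in an identifier."""
    return key.translate(DELETE_TABLE)


# docopt-style keys become names usable as attributes.
raw = {'--verbose': True, '<arg>': ['Hello', 'world.']}
cleaned = {clean(k): v for k, v in raw.items()}
```

With keys cleaned this way, '--verbose' becomes verbose and '<arg>' becomes arg, which is what allows argv.verbose and argv.arg in the example that follows.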

Source Code for Dict class (passes pep8 and pyflakes, pylint and PyChecker 1 warning)

The reason pylint raises a warning is the absence of a super-init. This is intentional, since the dictionary will be managed using a different style.

#!/usr/bin/env python

"""Dict.py
Author: Jonathan D. Lettvin
LinkedIn: jlettvin
Date: 20141020
"""

from types import (MethodType)


class Dict(dict):
    """
    This derived class adds the ability to view and modify a dictionary
    using the simpler lexical naming convention used for code token names
    rather than the more complex braces and quotes dictionary accessors.
    """

    remove = '-<>'  # Add characters to this string to get them removed.

    @staticmethod
    def _clean(k):
        """
        This private method removes non-token characters from strings.
        Remove all '-' characters from hashable key to make it lexically ok,
        for instance when converting argv command-line dicts.
        https://docs.python.org/2/library/string.html#string.translate
        """
        return k.translate(None, Dict.remove)

    def _method(self, value):
        """
        This private method converts external functions to internal methods.
        https://docs.python.org/2/library/types.html
        """
        assert type(value) == type(Dict._clean)
        return MethodType(value, self, Dict)

    def __init__(self, **kw):
        """
        __class__ has a __dict__ member.
        By replacing this __dict__ member with the parent dict builtin
        we may access its members lexically instead of with hashable keys.
        """
        self.__dict__ = self                # This is the magic line
        self.let(**kw)

    def __call__(self, **kw):
        'Update the instance dictionary'
        self.let(**kw)
        return self

    def let(self, **kw):
        "instance dictionary bulk updater"
        self.update({Dict._clean(k): v for k, v in kw.items()})
        return self

    def method(self, **kw):
        "instance dictionary bulk method attacher"
        self.update({Dict._clean(k): self._method(v) for k, v in kw.items()})
        return self

Example

Review this code carefully. The two functions to review are test_old and test_new. The two functions are logically identical. The value-add of the Dict class is seen in the simplified syntax of test_new.

Source Code for Example (passes pep8 and pyflakes, pylint gives 2 warnings)

The warnings flagged by pylint indicate use of the magic '**' prefix when passing arguments. This is expected and acceptable.

#!/usr/bin/env python

"""Argv.py

Usage:
    Argv.py [(-v|--verbose)] <arg>...
    Argv.py (-h | --help)
    Argv.py (--version)

Options:
    -v, --verbose                   Show execution details [default: False]
    -h --help                       Show this screen
    --version                       Show version

Example:
    ./Argv.py -v Hello world.

Author: Jonathan D. Lettvin
LinkedIn: jlettvin
Date: 20141020
"""

if __name__ == "__main__":
    from docopt import (docopt)
    from Dict import (Dict)

    def test_old(**kw):
        "This function tests standard dictionary operations."
        kw.update({'title': '\told_dict', 'doc': 'illustrates standard dict'})
        if kw['--verbose']:
            print kw['title'], kw['doc']
            print kw['--verbose']
            print kw['<arg>']

    def test_new(argv):
        "This function tests new Dict class dictionary operations."
        argv(title='\tnew_Dict', doc='illustrates new Dict class.')
        if argv.verbose:
            print argv.title, argv.doc
            print argv.verbose
            print argv.arg

    def main():
        "This function is the root of execution."
        kwargs = docopt(__doc__, version="0.0.1")
        argv = Dict(**kwargs)
        test_old(**kwargs)
        test_new(argv)

    main()

Unit Test

Explanation

Standard python dictionaries are very useful. However, sometimes it is more convenient to use unquoted class member names to gain access to data stored in dictionaries rather than brackets and quoted strings.

This module exercises the conversion from dictionary to member name.

Section 1 __doc__ and unit test support for module

In this module docopt [18] is used to parse command-line args. In practice, the conventions for docopt interfere with those for unittest2. A simpler unit test approach is implemented in the Compare class which uses a functor to report progress and announce comparison failures. In this module, no comparison failures occur. The reporting mechanism is useful for rapid development with a convenient [PASS] and [FAIL] message (colored if desired) accompanying the execution stage message.

The module __doc__ string is used by docopt to configure and constrain command-line args. This style of access is simpler than traditional python dictionary accessing using hashables.

Section 2 import Dict class

This is simply an import.

Section 3 test_0000

This is the meat of this module. A series of functions, each of which returns a container of new and old style dictionaries, is run by the Compare functor, and equality is reported, showing proper operation of the new Dict class.
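The equality check that the functor applies to each returned container can be sketched on its own. The function name all_adjacent_equal is hypothetical; the zip over neighbouring items mirrors the approach used in the Compare class, without the reporting and color handling.

```python
def all_adjacent_equal(*args):
    """Return True when every neighbouring pair of arguments is equal."""
    items = list(args)
    # Pair each item with its successor: (0, 1), (1, 2), and so on.
    return all(a == b for a, b in zip(items[:-1], items[1:]))


# A two-item container, like the (new, old) pairs the tests return.
pair_ok = all_adjacent_equal({'a': 1, 'b': 2}, dict(a=1, b=2))
# A three-item container, like the strings from the method test.
triple_bad = all_adjacent_equal('ergo sum', 'ergo sum', 'ergo est')
```

Comparing neighbours rather than every pair is enough here because equality of each adjacent pair implies equality of the whole sequence for well-behaved values such as dictionaries and strings.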

Section 4 __main__ section

The module execution as a command-line program is prepared with imports and service functions.

Section 5 main function

The docopt command-line parser is run.

Section 6 main call

This is the root of execution other than the singleton instancing of the Compare class as REPORT.

Source Code for Unit Test (passes pep8, fails pyflakes, pylint, and PyChecker)

pyflakes fails because an imported variable is used only inside a string passed to eval, not directly. The pylint and PyChecker failures are for acceptable reasons.

R: 44, 0: Too few public methods (1/2) (too-few-public-methods)
W:212,19: Used * or ** magic (star-args)
W:212,31: Used * or ** magic (star-args)
W:217,19: Unused argument 'self' (unused-argument)
W:254,11: Use of eval (eval-used)
W:267, 8: Used * or ** magic (star-args)
W:270, 8: Used * or ** magic (star-args)
W:249, 4: Unused import version_info (unused-import)

#!/usr/bin/env python

# Section 1 -------------------------------------------------------------------
"""testdict.py

Usage:
    testdict.py \
[(-c<Hues>|--color=<Hues>)]\
[(-v|--verbose)]\
[(-w<W>|--width=<W>)]
    testdict.py (-h | --help)
    testdict.py (--version)

Options:
    -c <Color>, --color=<Color>     Print colors for pass/fail [default:"00"]
    -w <Width>, --width=<Width>     Print width of conclusion [default:79]
    -v, --verbose                   Show execution details [default: False]
    -h --help                       Show this screen
    --version                       Show version

This module is designed as a teaching exercise.
Although the new Dict class is operational and serviceable
it is not necessarily complete and matured for all possible uses.

<Color> is expected to be a pair from the set '0rgybmcw':
    0: black  r: red      g: green  y: yellow
    b: blue   m: magenta  c: cyan   w: white
    where the first color is used for FAIL and the second for PASS
    against the current default background color.
    (i.e. --color=rg makes FAIL appear in red and PASS appear in green).
    If not given on the command-line or set to "00",
    no ANSI color formatting is used, to enable text editor review of output.
    http://en.wikipedia.org/wiki/ANSI_escape_code

Example:
    ./testdict.py --color=rg

Author: Jonathan D. Lettvin
LinkedIn: jlettvin
Date: 20141020
"""


class Compare(object):
    """
    This class presents a singleton functor and conclusion interface.
    The functor compares pairs of adjacent objects expecting equality.
    The conclusion prints the result of all preceding functor compares
    in a clean and easy-to-scan review format.
    This class provides limited local unit test support to avoid
    command-line argument parsing conflict between the libraries named
    unittest2 and docopt.
    ANSI colors may be used to enhance unit test output via command-line args.
    """

    Width, Pass, Fail, Verbose, Text, Color = 79, 0, 0, True, [], "00"
    colorList = {k: n for n, k in enumerate('0rgybmcw')}
    ANSIformat = '\x1b\x5b3%sm'
    COLOR = [ANSIformat % (hue) for hue in (0, 0, 0)]

    @staticmethod
    def _recolor(i, hue):
        "Update ANSI sequences for output"
        assert 0 <= i <= 1 and 0 <= hue <= 7
        Compare.COLOR[i] = Compare.ANSIformat % (hue)

    @staticmethod
    def __call__(msg, *args, **kw):
        """
        This functor compares adjacent pairs of objects in its arg list.
        args is a list of objects to compare.
        kw is the key:value pairs from the docopt command-line dictionary.
        If there is no pair, the message is added without compares.
        """
        Compare.Verbose = kw.get('--verbose', Compare.Verbose)
        Compare.Width = kw.get('--width', Compare.Width)
        # Update ANSI color usage if default and proposed on command-line.
        if Compare.Color == "00":
            # This conditional handles deferred command-line arg handling.
            if '--color' in kw.keys():
                color = kw.get('--color', Compare.Color)
                Compare.Color = color if color else Compare.Color
                assert len(Compare.Color) == 2
                assert all([h in Compare.colorList for h in Compare.Color])
            for index, char in enumerate(Compare.Color):
                assert char in Compare.colorList.keys()
                color = Compare.colorList[char]
                Compare._recolor(index, Compare.colorList[char])
        # Update width of output if proposed on command-line.
        Compare.Width = int(Compare.Width if Compare.Width else 79)
        if not args:
            # Add message without PASS/FAIL if args are absent.
            if Compare.Verbose:
                Compare.Text.append([False, '       %s' % (msg)])
            return
        else:
            # Compare args if present and indicate PASS/FAIL
            try:
                largs = list(args)
                assert all([(b == c) for b, c in zip(largs[:-1], largs[1:])])
                Compare.Text.append([True, '%(PASS)s ' + msg])
                Compare.Pass += 1
            except AssertionError:
                Compare.Text.append([True, '%(FAIL)s ' + msg])
                Compare.Fail += 1
        for arg in args:
            # Add display of tested args if --verbose
            Compare.Text.append([False, arg])

    @staticmethod
    def conclusion():
        "This method assembles a report into a printable form."

        def rule(retval, **kw):
            "This function makes a horizontal line with a title (if given)."
            msg, char = (kw.get(k, v) for k, v in [('msg', ''), ('char', '-')])
            msg = ' %s ' % (msg) if msg else char*2
            retval += char*6 + msg + char*(Compare.Width-len(msg)-6) + '\n'
            return retval

        # Start with a horizontal line.
        retval = rule('', msg='INDIVIDUAL')
        # Loop through accumulated proposed outputs.
        for show, text in Compare.Text:
            if not Compare.Verbose:
                if show:
                    retval += '%s\n' % (text)
            else:
                retval += '%s\n' % (text)
        # Add another horizontal line.
        retval = rule(retval, msg='SUMMARY')
        # Add summaries of PASS/FAIL counts and REVIEW flag if FAILS.
        review = '   REVIEW!' if Compare.Fail else ''
        retval += '%(PASS)s' + ' count = %d\n' % (Compare.Pass)
        retval += '%(FAIL)s' + ' count = %d %s\n' % (Compare.Fail, review)
        # Add a final horizontal line with a timestamp.
        import datetime
        retval = rule(retval, msg=datetime.datetime.now())
        # Setup string formatting to substitute ANSI colors where proposed.
        failpass = ['[FAIL]', '[PASS]']
        if Compare.Color != "00":
            failpass = [
                Compare.COLOR[i]+failpass[i]+Compare.COLOR[2]
                for i in range(2)]
        # Use string formatting to substitute ANSI colors where proposed.
        return retval % {'FAIL': failpass[False], 'PASS': failpass[True]}

# Create a compare singleton instance to hold unit test results.
# Use it to declare the previous section immediately.
REPORT = Compare()
REPORT('Section 1   docopt and unit test setup')

REPORT('Section 2   definition of Dict class')
# Section 2 -------------------------------------------------------------------

from Dict import (Dict)
from types import (MethodType)

REPORT('Section 3   definition of unit test function')
# Section 3 -------------------------------------------------------------------


def test_0000(**kw):
    """
    This function is the typically named unit test entry point.
    Numbered subtest functions are defined which return pairs of objects
    which are supposed to compare as equal.
    A dictionary at the bottom collects and names the subtests.
    A function named 'act' executes a single test with further labeling.
    A loop marches through the dictionary executing the tests with 'act'.
    """

    def testfunction0():
        "Compare after default initialization."
        return (Dict(), dict())  # Does new class compare to encapsulated one?

    def testfunction1():
        "Compare after standard initialization."
        return (Dict(a=1, b=2), dict(a=1, b=2))  # How about initialized?

    def testfunction2():
        "Compare after lexical access on new class."
        dict1, dict2 = dicts12 = (Dict(a=1, b=2), dict(a=1, b=2))
        dict1.a = 3
        dict2['a'] = 3
        return dicts12  # How about updated after initialized?

    def testfunction3():
        "Compare after new class updating method."
        dict1, dict2 = dicts12 = (Dict(a=1, b=2), dict(a=1, b=2))
        dict1.let(a=5, b=6, c=-1)
        dict2.update({'a': 5, 'b': 6, 'c': -1})
        return dicts12  # How about bulk updating?

    def testfunction4():
        "Check new class acts like standard dict."
        dict1, dict2 = dicts12 = (Dict(a=1, b=2), dict(a=1, b=2))
        dict1.update({'a': 7, 'b': 8, 'c': -2, 'd': -3})
        dict2.update({'a': 7, 'b': 8, 'c': -2, 'd': -3})
        return dicts12  # How about old-style updating on new class?

    def testfunction5():
        "Compare lexical and keyword accessing."
        dict1, dict2 = dicts12 = (Dict(), dict())
        dict1.hello = 'hello world'
        dict2['hello'] = 'hello world'
        return dicts12  # How about adding lexically vs. dictionary?

    def testfunction6():
        "Compare converting command-line arguments."
        converted = {k.replace('-', ''): v for k, v in kw.items()}
        dicts12 = (Dict(**kw), dict(**converted))
        return dicts12  # How about command-line args with '-' characters?

    def testfunction7():
        "Compare execution after method attachment."
        def cogito(self, msg):
            "This function will be converted to a method in both dictionaries."
            return '%s %s' % (therefore, msg)
        therefore, exist = 'ergo', 'sum'
        dict1, dict2 = Dict(), Dict()
        dict1.Ithink = MethodType(cogito, dict1, Dict)
        dict2.method(Ithink=cogito)
        string0 = ' '.join([therefore, exist])
        string1, string2 = dict1.Ithink(exist), dict2.Ithink(exist)
        return (string0, string1, string2)  # How about adding a method?

    def act(msg, fun):
        "Run compare function on test function results."
        REPORT('Section 3.%s %s' % (msg, fun.__doc__), *fun())

    # Here is a dictionary of labelled test functions to run.
    test = {'0: empty dictionary': testfunction0,
            '1: initialized vals': testfunction1,
            '2: separate updates': testfunction2,
            '3: bulk data update': testfunction3,
            '4: old styles works': testfunction4,
            '5: lexical updating': testfunction5,
            '6: command-line arg': testfunction6,
            '7: attached methods': testfunction7, }

    # Here is a loop to execute all the labelled test functions.
    for key in sorted(test.keys()):
        act(key, test[key])

# Section 4 -------------------------------------------------------------------
if __name__ == "__main__":
    REPORT('Section 4   entering __main__ section')
    from sys import (version_info)
    from docopt import (docopt)

    def exitif(condition):
        "This function identifies untested python versions, and exits."
        if eval(condition):
            print "%s: Sorry, only python version 2.7 is tested" % (condition)

# Section 5 -------------------------------------------------------------------
    REPORT('Section 5   defining main() function to setup/run unit test')

    def main():
        """
        This is the root of execution.
        When executed from the command-line, this section is the entry-point.
        http://docopt.org/
        """
        kwargs = docopt(__doc__, version="0.0.1")
        REPORT('Section 4.1 set verbosity flag', **kwargs)
        exitif("version_info.major != 2")
        exitif("version_info.minor != 7")
        test_0000(**kwargs)
        print REPORT.conclusion()

# Section 6 -------------------------------------------------------------------
    REPORT('Section 6   execute main and display unit test final result')
    main()

Conclusion

There really are no objective Do's and Don'ts in programming. There are aesthetics with strong proponents and opponents. Use what works. Make it useful. Don't make it cost too much (unless it pleases you to do so). Accept criticism gladly, because surely you will be criticized. There will always be trolls.


References

  1. http://en.wikiquote.org/wiki/Donald_Knuth
  2. http://en.wikipedia.org/wiki/Source_lines_of_code
  3. http://www.stroustrup.com/
  4. http://synesis.com.au/publishing/imperfect/cpp/
  5. http://stackoverflow.com/questions/243383/why-cant-c-be-parsed-with-a-lr1-parser
  6. http://stackoverflow.com/questions/14589346/is-c-context-free-or-context-sensitive
  7. http://www.computing.surrey.ac.uk/research/dsrg/fog/FogThesis.pdf
  8. http://www.scipy.org/
  9. http://dirac.cnrs-orleans.fr/plone
  10. http://docs.enthought.com/mayavi/mayavi/
  11. http://legacy.python.org/dev/peps/pep-0008/
  12. http://en.wikipedia.org/wiki/Considered_harmful
  13. https://github.com/jlettvin/UnsignedLongLongLexer
  14. http://legacy.python.org/dev/peps/
  15. http://www.perlmonks.org/?node=Obfuscated+Code
  16. http://en.wikipedia.org/wiki/Duck_typing
  17. http://www.ianlewis.org/en/dynamically-adding-method-classes-or-class-instanc
  18. http://docopt.org/