Jonathan D. Lettvin
Big Data Mathematics
I attack/solve "insoluble" Big Data problems and generate unexpected successes (see my resumé for examples). I have loved big data since before it was named. I use novel mathematical methods where databasing would obscure features. I think about it, visualize it, detect events in it, and discover monetizable value in it where others see only noise.
To achieve my results, I develop high-quality, high-performance code that is small, fast, correct, complete, and secure. My code targets zero bugs, with unit tests, edge-case handling, 100% coverage, and orders-of-magnitude performance boosts. I comment and document heavily. I prefer to collaborate, but work very well autonomously.
I have experience with many languages, editors, platforms, etc. Learning new tools rarely improves my problem-solving abilities and usually takes time away from solving really interesting problems with the tools I already have. Learning Python was one of those rare exceptions because a strong scientific community has evolved excellent, optimized, and easily used libraries. In the spirit of avoiding premature optimizations, Python is my preferred prototyping language.
When performance issues arise, I switch to C++ and Intel assembly. I use VirtualBox to sandbox environments. My preferred development environment includes Linux, git, MediaWiki, make, and gvim. I'm interested in GPU computing and MapReduce. I am experienced in architecting and implementing operating systems, editors, compilers, data ingesters, lattice math, and feature detection.
Design Patterns (editorial)
All code requires patterned thinking. Patterned thinking arose long before the modern classification system. I used most of the patterns before they were named, including: Lazy Initialization, Multiton, Object Pool, RAII, Adapter, Composite, Decorator, Flyweight, Module, Chain of Responsibility, Command, Interpreter, Iterator, Memento, Null Object, Observer, Publisher Subscriber, State, Strategy, Servant, Template, Visitor, Active Object, Balking, Event Based Asynchronous, Join, Lock, Messaging, Monitor Object, Reactor, Read Write Lock, Scheduler, Thread Pool, and Thread Specific Storage. I find Design Patterns limiting and I find the naming conventions ambiguous. Much of what is called Style is actually a set of additional Meta-Design-Patterns.
I think of programming as a three-part process of creating an impedance match between a mechanism of inputs, a mechanism of transforms, and a mechanism of outputs. A problem occupies a "problem space" in which a "preferred language" can be designed. A "solution space" is the architecture that fits the problem space to available resources. It is often possible to have a closed form solution and a simple high-performance implementation. Sometimes Design Patterns can be used to guide development, sometimes they cannot. Sometimes a focus on Design Patterns interferes with discovering optimal solutions.
One of my better pieces of code in C++ is an LR1 lexer with two branch points. The closest design pattern is "Interpreter", but it is a stretch. It implements a closed solution for converting a string representation to an unsigned long long. It sounds like an easy problem to solve, but it is subtle. Achieving high performance with intrinsic edge-casing is a challenge.
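To illustrate why this conversion is subtle, here is a minimal Python sketch of the overflow edge case. It is not the C++ lexer described above: the names `parse_u64` and `U64_MAX` are hypothetical, the input is assumed to be plain decimal digits, and the range is assumed to be 64 bits. The point it shows is that the overflow test has to happen before the multiply, because fixed-width arithmetic cannot detect the overflow afterwards.

```python
U64_MAX = 2**64 - 1  # assumed range: a 64-bit unsigned long long


def parse_u64(text: str) -> int:
    """Convert a decimal digit string to an integer in [0, U64_MAX].

    Raises ValueError on empty input, non-digit characters, or overflow.
    """
    if not text:
        raise ValueError("empty string")
    value = 0
    for ch in text:
        if not ("0" <= ch <= "9"):
            raise ValueError(f"invalid digit {ch!r}")
        digit = ord(ch) - ord("0")
        # Check before multiplying: value * 10 + digit must not exceed U64_MAX.
        if value > (U64_MAX - digit) // 10:
            raise ValueError("overflow")
        value = value * 10 + digit
    return value
```

The boundary values are the interesting tests: `"18446744073709551615"` must parse and `"18446744073709551616"` must be rejected.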
Another piece of good code I wrote in Python is a Roman Numeral converter in any base between 7 and 60. This code solves its problem completely: the numbers expressible in Roman Numerals have both a lower and an upper limit, so the solution can be tested by brute force methods. It doesn't fit any Design Pattern, yet it solves a problem using a useful pattern.
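The brute-force idea can be sketched as follows. This is not the multi-base converter described above: it assumes the conventional decimal system with the range 1 to 3999, and the names `to_roman`, `from_roman`, and `ROMAN_PAIRS` are hypothetical. Because the domain is finite, a round trip over every value tests the converter exhaustively.

```python
# Value/symbol pairs in descending order, including the subtractive forms.
ROMAN_PAIRS = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]
ROMAN_MIN, ROMAN_MAX = 1, 3999  # assumed limits for the conventional system


def to_roman(number: int) -> str:
    """Return the canonical Roman numeral for a number in range."""
    if not ROMAN_MIN <= number <= ROMAN_MAX:
        raise ValueError(f"out of range: {number}")
    parts = []
    for value, symbol in ROMAN_PAIRS:
        count, number = divmod(number, value)
        parts.append(symbol * count)
    return "".join(parts)


def from_roman(text: str) -> int:
    """Parse a canonical Roman numeral; reject anything else."""
    total, rest = 0, text
    for value, symbol in ROMAN_PAIRS:
        while rest.startswith(symbol):
            total += value
            rest = rest[len(symbol):]
    # Leftover characters or a non-canonical spelling are both errors.
    if rest or to_roman(total) != text:
        raise ValueError(f"not a canonical Roman numeral: {text!r}")
    return total


def brute_force_check() -> bool:
    """Round-trip every representable number."""
    return all(
        from_roman(to_roman(n)) == n
        for n in range(ROMAN_MIN, ROMAN_MAX + 1)
    )
```

Calling `brute_force_check()` exercises the whole domain in one pass, which is the sense in which a bounded problem can be solved completely.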
I would claim that the existing set of Design Patterns is missing some critical types. For instance, I would claim that code is incomplete without unit tests, edge casing, and coverage. Both of the above pieces of code use that pattern to achieve 100% coverage and error handling. Neither has dead code, extra code, or failing code. Self-Test is a Design Pattern.
Another missing Design Pattern is Disaster-Recovery or ACID. When a program fails, it often leaves residual errors that require cleanup. Some programs from the 1970s were designed to be 100% recoverable. The only possible losses arose from incomplete transmissions or storage losses and corruptions. Geographically distributed bit dispersion storage reduces the latter problem considerably. Databases are designed using ACID criteria (Atomicity, Consistency, Isolation, Durability). These critically important criteria are only partially conveyed in Design Patterns.
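One small instance of the recoverability idea is an all-or-nothing file update. The sketch below is an assumption about how such a pattern might look in Python, not a description of any of the programs mentioned above; `atomic_write` is a hypothetical helper. It writes to a temporary file in the same directory, forces the data to disk, and then renames it over the target, so a crash leaves either the old file or the new one, never a partial mix. It covers only the Atomicity and Durability parts of ACID, and it relies on `os.replace` being atomic, which holds on POSIX file systems when source and target are on the same file system.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Replace the file at `path` with `data`, all or nothing."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives beside the target so the rename stays
    # on one file system.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())  # push the bytes to stable storage
        os.replace(tmp_path, path)     # the commit point
    except BaseException:
        # Failure before the commit: remove the residue and re-raise.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

The cleanup branch is the part that addresses residual errors: a failed update leaves no stray temporary file behind.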
So, although knowledge of Design Patterns is a basic test of understanding coding, I find that it is a low bar, and that the bar ought to be set higher. I have been in only one interview where the focus was on unit testing and edge casing. The code named tenpin was the result and, again, it uses Self-Test as a design pattern. All of these modules exhibit another unnamed Design Pattern I would name Problem-Definition. A module is incomplete if it does not clearly express its problem space. Python enables docstrings to contain runnable unit tests (doctests). These can be used to illustrate the Problem-Definition.
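As a sketch of Problem-Definition combined with Self-Test, the function below states its problem in a docstring whose examples are run by Python's standard `doctest` module. This is not the tenpin code referred to above: `tenpin_score` is a hypothetical function, and it assumes the input is a complete, valid game given as a flat list of pins per ball.

```python
import doctest


def tenpin_score(rolls):
    """Return the total score for a complete game of tenpin bowling.

    `rolls` is the flat list of pins knocked down by each ball.

    >>> tenpin_score([0] * 20)          # all gutter balls
    0
    >>> tenpin_score([1] * 20)          # no strikes or spares
    20
    >>> tenpin_score([5] * 21)          # a spare in every frame
    150
    >>> tenpin_score([10] * 12)         # perfect game
    300
    """
    total, i = 0, 0
    for _frame in range(10):
        if rolls[i] == 10:                     # strike: 10 plus next two balls
            total += 10 + rolls[i + 1] + rolls[i + 2]
            i += 1
        elif rolls[i] + rolls[i + 1] == 10:    # spare: 10 plus next ball
            total += 10 + rolls[i + 2]
            i += 2
        else:                                  # open frame
            total += rolls[i] + rolls[i + 1]
            i += 2
    return total


if __name__ == "__main__":
    doctest.testmod()  # silent when every docstring example passes
```

The docstring does double duty: a reader sees the edge cases that define the problem, and running the module checks them.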
My clients/employers include Carbonite, Lotus, IBM, NASA, MIT, and many small high tech startups. I have patents and publications, and have contributed to project success in many arenas. These include canonicalization, virus search, high speed lexing, discrete convolution/correlation, efficiency calculations, dimensional conversion, and automated generation of code and papers.
My personal goal is to answer Cajal's three questions about nervous systems (ISBN 0-19-507401-7 Histology of the Nervous System): "Practitioners will only be able to claim that a valid explanation of a histological observation has been provided if three questions can be answered satisfactorily: what is the functional role of the arrangement in the animal; what mechanisms underlie this function; and what sequence of chemical and mechanical events during evolution and development gave rise to these mechanisms." Santiago Ramón y Cajal
Principally, I work on the first two questions: I model observed groups of shaped neurons. I model observed signal propagation and expression. I replicate observed functional roles. As a personal Python OpenCV Big Data project, I research and develop neuron-shaped mathematical transforms, as discrete 3D convolution/correlation kernels achieving far sub-pixel image feature detection, using methods learned during my MIT Physics training and early experience in a wet neuroscience lab.
The unknown only becomes known by one who makes mistakes. For example, all advances in science are achieved by violating inviolable rules, whether intentionally or not. It is a common theme in the Nobel lectures given by the prize winners (http://www.nobelprize.org). As I like to say, "You know the value of your mistake by the size of the army mounted against you."