WACD: A Concept Worth Passing Along

Posted by David Richards Fri, 08 Jan 2010 18:57:00 GMT

I thought this article was funny and to the point: cowboy coding doesn't work.

A Bayesian Mind

Posted by David Richards Sat, 26 Dec 2009 08:48:58 GMT

Since I received Causality for Christmas, I've been thinking a lot more about Bayesian inference. I thought I'd jot down a few notes to see if these things are clearly outlined in my head.

Causality

Posted by David Richards Sat, 26 Dec 2009 06:31:37 GMT

I've been reading Causality by Judea Pearl, one of my Christmas presents this year. It's an amazing read. I started with the epilogue, to get the grand view of the book from a researcher's perspective. The epilogue's contents can actually be found here.

The epilogue is the transcript from a lecture that Pearl gave to UCLA researchers around the time of the first edition of the book. In it, Pearl outlines the history and issues of causality.

Blogging from TextMate

Posted by David Richards Sat, 26 Dec 2009 06:06:51 GMT

Removing yet another barrier in my life, I've figured out how to blog from TextMate. A screencast demonstrating how to do this is found here. There are a few issues I have yet to work out:

  • I've had some issues with the post timing out on a new post. It makes it there all right. But it doesn't seem to want to handle things very well.
  • I have some issues with Typo recognizing my Markdown. I followed the ideas on this post, but that doesn't seem to be it.

If anyone has used TextMate with Typo, I'd love a comment or two. Google doesn't seem to have much to say on the subject tonight.

Mahout Version 0.2 is Out

Posted by David Richards Fri, 20 Nov 2009 01:18:00 GMT

I've been watching "Mahout":http://lucene.apache.org/mahout/ for a little while. I'm very impressed with all the activity they have going on. Mahout is a machine learning framework built on "Hadoop":http://hadoop.apache.org/. They seem to be doing the thing that Tegu is only aware of at this point. "Alex Handy":http://www.sdtimes.com/author/ahandy.aspx seems to be enthused as well, with his "review of Mahout":http://www.sdtimes.com/blog/post/2009/11/19/We-are-the-big-data-problem.aspx.

I think this is the one to watch. I'll be posting my experiments with the framework as they mature and are more presentable.

On HAML and jQuery

Posted by David Richards Sat, 14 Nov 2009 07:29:00 GMT

I just got carried away, responding to "another blog post":http://www.toohardtopronounce.com/code/2009/05/02/on-haml-and-jquery/. I spent enough time on that comment, I thought I'd post it here as well:

When I found HAML at its inception, I found something powerful that I couldn't explain. I told people it was because I didn't have as much noise in my templates, but there was something more powerful about it than just that. I've realized that simplicity is a form of power: it saves me time, but it also begins to unveil meaning to me.

A simple thing embodies the task at hand. The egg-shaped handle of a kitchen utensil seems to say, "hold me." Quiet HAML seems to say, "shape meaning with me." As a handle is built for holding, I realize that template languages are for structuring. HAML invites me to engage in the meaning of a document by seeing the structure rather than the syntax.

And, I find that using HAML, I adapt my templates more quickly. That's partly because it's much more expensive to change HTML than to write it. HAML is easier to write, but MUCH easier to edit. If I move something with HAML, I don't have to chase down a closing bracket, somewhere else on a page. I just move the content I am working on.

The web is not HTML, it is a universal expression of meaning. We build web applications and pages because we mean something by them. HAML, then, didn't add an abstraction layer, it removed one. It did this by the laws of reduction: shrink, hide, and embody.

Yes, there are those pesky moments when I miss a space, or a tab sneaks into my white space somehow, and all hell breaks loose. I have to go and find out what's wrong before I can move forward. The act of keeping my HAML pristine, however, leaves me with a sense that I actually accomplished my purpose: I presented my meaning in a clean and orderly way.

jQuery seems to do the same. It is not Javascript, and so doesn't try to take over the Javascript space. It humbly offers its services, unless they are otherwise engaged. Non-jQuery objects become jQuery objects with a simple $() wrapper around them. Lists and single items behave in the same ways, so that unnecessary object inspection is hidden away from my code. The selectors simply embody most of the selectors I've already come to know. jQuery doesn't try to add layers, but remove them. It implements server communication through GETs and POSTs with $.get and $.post. It makes it as easy to extend the library as it is to use it. It presents itself as a designed library, rather than just a slipshod collection of methods and features.

A Context for Classification

Posted by David Richards Sun, 08 Nov 2009 17:48:00 GMT

When I discovered systems and system archetypes and dynamic models and these kinds of things, I fell in love. I moved to Portland, I ate the stuff up. Life happened since then, and I'm addressing other issues, but I still have this love affair with seeing the integration of elements of a system. When the purpose of a system manifests itself from the whole of its parts, it's like a ballet to me.

An example from my boyhood is the systems that produce amusement parks. Once I could begin to see the forces at play that combined physics, engineering, economics, and the pursuit of pleasure to create an amusement park, that idea was immensely more pleasurable than the rides themselves. As a boy of around twelve years old, the moment of riding a roller coaster was just an interruption from the joy of realizing the systems all around me.

A friend saw some of this the other day. He walked into my office and wanted to talk about how he took some of my ideas from my "UTOSC talk":http://blog.tegugears.com/2009/10/08/utosc-resources to cluster the voters in the Utah Republican Party. He could start to see that there were forces at play behind the votes. These voters had purposes of their own, but as a whole, the system began to manifest a purpose and direction for itself. The structure of the elements of the Utah Republican Party guides its behavior.

I could be wrong, but I think that's why we get excited about classification. Simple linear regression can often draw a line between two classes in fairly useful ways. Neural networks, support vector machines, Gaussian processes, decision trees, KD Trees, all these wonderful inventions begin to tease out the players of a system. We can't always see these things from just the data, and a priori information needs to be asserted, but we live in a world where the common man can work on these things.
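To make the regression remark concrete, here is a minimal sketch in plain Ruby. It assumes a single input variable and two classes labeled 0 and 1; the data, the @fit_line@ and @classify@ helpers, and the 0.5 cutoff are all made up for illustration. It fits a least-squares line to the labels and calls everything above the cutoff class 1.

```ruby
# Hypothetical toy data: one input per observation, labels 0 or 1.
xs = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
ys = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]

# Ordinary least squares for y = intercept + slope * x.
def fit_line(xs, ys)
  n = xs.size.to_f
  mean_x = xs.sum / n
  mean_y = ys.sum / n
  covariance = xs.zip(ys).sum { |x, y| (x - mean_x) * (y - mean_y) }
  variance = xs.sum { |x| (x - mean_x)**2 }
  slope = covariance / variance
  [mean_y - slope * mean_x, slope]
end

# Assumed decision rule: a fitted value of 0.5 or more means class 1.
def classify(x, intercept, slope)
  intercept + slope * x >= 0.5 ? 1 : 0
end

intercept, slope = fit_line(xs, ys)

# The point where the fitted line crosses 0.5 is the class boundary.
boundary = (0.5 - intercept) / slope

puts "boundary near x = #{boundary.round(2)}"
puts "x = 2.5 -> class #{classify(2.5, intercept, slope)}"
puts "x = 6.5 -> class #{classify(6.5, intercept, slope)}"
```

With data this tidy the boundary lands midway between the two groups. Real observations are rarely so cooperative, which is where the other methods start to earn their keep.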

"Cherkassky and Mulier":http://www.amazon.com/Learning-Data-Concepts-Theory-Methods/dp/0471681822/ref=sr11?ie=UTF8&s=books&qid=1257704359&sr=8-1 are more exact when they explain that

bq. Learning is the process of estimating an unknown (input, output) dependency or structure of a System using a limited number of observations.

In other words, from observation, we learn how a system might turn inputs into outputs. The knowledge of the properties of steel and the forces of nature (physical inputs) can guide the creation of roller coasters and machines that flip us around in death-defying ways and give us a thrill of a lifetime (system outputs). For the price of a small ticket (our economic inputs), we can share in the accumulation of thousands of hours and millions of dollars to share in those thrills (system outputs).

What's more important, these systems can be generalized, to a point. Disney can create the happiest place on earth in Orlando and Anaheim, yet fall quite flat in Paris. The Harvard business case suggests they didn't react to the observations available to them on that project.

"Cherkassky and Mulier":http://www.amazon.com/Learning-Data-Concepts-Theory-Methods/dp/0471681822/ref=sr11?ie=UTF8&s=books&qid=1257704359&sr=8-1 go further to explain

bq. Under [the] statistical model estimation framework, the goal of learning is accurate identification of the unknown system, whereas under predictive learning the goal is accurate imitation of a system's output.

Those are the first steps in a difficult and rewarding journey through the world of data analysis, as guided by "Cherkassky and Mulier":http://www.amazon.com/Learning-Data-Concepts-Theory-Methods/dp/0471681822/ref=sr11?ie=UTF8&s=books&qid=1257704359&sr=8-1. They pick apart and give us a great context for classification methods. Above, they show us that the statistical model estimation framework wants to point out means and distributions and skew and kurtosis. The machine learning world is more interested in simply knowing what predictive power the observations might have. I.e., it's enough to know what a system does, rather than all about how it does it.

I introduce this book because you may want to actually get somewhere with your work.

Another way to get somewhere is to put classification in the context of the "Laws of Simplicity":http://lawsofsimplicity.com/, a framework from "John Maeda":http://www.maedastudio.com/index.php. The models I describe above reduce a complex system to a few inputs and outputs. This is the first law of simplicity. We know that the reduced model isn't accurate, but it's more useful than a complete model. It suggests trends and decisions and distinctions, where a complete model looks complex and chaotic and undetermined.

The way things should be reduced, says Maeda, is by SHE:

  • Shrink
  • Hide
  • Embody

When classifying or learning a system, we shrink its parameters. We use "Principal Component Analysis":http://en.wikipedia.org/wiki/Principal_component_analysis, "Reconstructability Analysis":http://www.sysc.pdx.edu/download/papers/ldlpitfabstract.htm or other "parsimony methods":http://hunch.net/~jl/projects/reductions/reductions.html to make the problem tractable. A business person or research assistant can't often use a model that takes the coordination and harmony of 92 input variables, but they can work with one with three. If a three-parameter model is still useful, then it should be preferred over more complex models. The methodologies mentioned above propose ways of deciding how much to shrink a model.
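As a rough illustration of shrinking, the sketch below reduces two correlated inputs to one using the first principal component. It is only a sketch: the data is invented, the helper names are mine, and I use plain power iteration to approximate the component rather than a proper linear algebra library. The last step reports how much of the total variance the single remaining input still carries, which is one way to judge whether the shrunken model is still useful.

```ruby
# Hypothetical observations with two strongly correlated input variables.
data = [[2.0, 1.9], [4.0, 4.2], [6.0, 5.8], [8.0, 8.1], [10.0, 10.0]]

# Center each column on its mean.
def center(rows)
  n = rows.size.to_f
  means = rows.transpose.map { |col| col.sum / n }
  rows.map { |row| row.zip(means).map { |v, m| v - m } }
end

# Sample covariance matrix of already-centered rows.
def covariance(rows)
  dims = rows.first.size
  Array.new(dims) do |i|
    Array.new(dims) { |j| rows.sum { |r| r[i] * r[j] } / (rows.size - 1) }
  end
end

# Power iteration: repeatedly multiply by the matrix and renormalize to
# approximate the eigenvector with the largest eigenvalue.
def first_component(matrix, iterations = 100)
  vector = Array.new(matrix.size, 1.0)
  iterations.times do
    product = matrix.map { |row| row.zip(vector).sum { |a, b| a * b } }
    norm = Math.sqrt(product.sum { |v| v * v })
    vector = product.map { |v| v / norm }
  end
  vector
end

centered = center(data)
cov = covariance(centered)
component = first_component(cov)

# One score per observation replaces the two original inputs.
scores = centered.map { |row| row.zip(component).sum { |v, c| v * c } }

# Share of total variance kept by the single score.
total_variance = cov.each_index.sum { |i| cov[i][i] }
kept_variance = scores.sum { |s| s * s } / (scores.size - 1)
explained = kept_variance / total_variance

puts "share of variance kept: #{explained.round(4)}"
```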

Classification can also do a good job of hiding some of the complexity. Consider the structure of a neural network. There are understandable inputs, and desired outputs, and one or more hidden layers in between. Support Vector Machines create a mapping between n-dimensional data and its model by looking only at the observations near the division between classifications.

Hiding the complexity doesn't mean we don't know that it's there. It means we don't need to see the complexity to accept the Gestalt of a system. From a "wikipedia article":http://en.wikipedia.org/wiki/Gestalt_psychology:

bq. Gestalt...is a theory of mind and brain positing that the operational principle of the brain is holistic, parallel, and analog, with self-organizing tendencies, or that the whole is different from the sum of its parts. The Gestalt effect refers to the form-forming capability of our senses, particularly with respect to the visual recognition of figures and whole forms instead of just a collection of simple lines and curves.

Whether this model of psychology is accurate is beyond debate: it isn't. There is, of course, more going on. It's a model, and is subject to the same constraints any model I'd create would have. But it embodies the purpose of a system. It describes how the complex inputs get transformed into figures and forms in our minds.

Our systems should also embody the main purpose of a system: the purpose that is manifest from observations and results, not our desired purpose for a system. It is this kind of embodiment that shows us that our economic systems are machines for growth and our health care systems are to pit the wills of the strong against the weak.

So, shrinking, hiding, and embodying is a context for reducing a model. It is a context for working on classification systems. Learning to classify systems is a walkable path, one that I'm walking right now, one that I've been gladly walking for quite a while.

AJAX: Big Picture

Posted by David Richards Sun, 08 Nov 2009 07:35:00 GMT

I've been baffled by AJAX. Maybe I lack the imagination to work through visual problems and create interesting solutions. Maybe I get stuck on the details and don't ever get to see the big picture. Probably I get overwhelmed by the difficulty of all the exceptions a good designer needs to know. Certainly writing front-end code feels a lot like eighth-grade composition: a lot of rules, and little outstanding progress.

So, I took a step back this morning and started to look at the big picture. I read code from books and projects I admire, comparing and contrasting what I saw. Here is what I saw:

Using any Javascript functionality requires a server-side foundation. For me, this is jQuery and the plugins I choose to use. It involves questions of caching, the order code should be loaded, and managing the paths where things are kept. It pays for me to slow down a little and say out loud what I'm doing here, asking and answering questions. That's because I haven't set up these things enough times to self-correct if I make a little mistake. It also helps when I hear myself say: I'm putting this feature in application.js because it really is a foundational piece that affects the whole site. That way, I don't feel like I'm building chaotic messes with my Javascript. I realize that this is one of the main blocks I have to this kind of work: I started learning it when it was a real mess, much worse than it is today, and so I don't get that balanced feeling that I get from well-written Ruby.

Good Javascript is unobtrusive. This reduces the pains of maintenance by a large exponent. I lay out the content as structured outlines, and I attach behavior and style to that outline with CSS patterns. This process can be iterative if I'm patient. It feels more like writing regular expressions than designing a class. I don't have to figure it all out at once; I can ask simple questions and come up with simple answers. The critical issue is whether I keep going, or allow myself to get flustered and impatient.

I have mostly ignored the interface between the client and the server in years past. These were details that web server architects had to deal with, not web developers. Most of my work was just serving up HTML in the early years, and I was content to follow recipes for delivering that content. I didn't bother much with headers, content types, formal interfaces, and that kind of thing. Writing good AJAX means I know a bit about the data being transferred, since it can be in several different formats. For instance, I'm aware of delivering AJAX as JSON, XML, HTML, or Javascript. All of these are valid and have their uses. On the server side, I need to make sure Rails understands what the client is asking for. This could mean setting the "Accept" request header to "text/javascript" so that Rails knows what kind of response the client wants back. It could mean setting headers with

response.headers["Pragma"] = "no-cache"

or

response.headers["Cache-Control"] = "no-cache, no-store, max-age=0, must-revalidate"

without knowing the evolution of browser features that led to this type of code. To me, this is all gibberish. I wouldn't know to consider no-store, no-cache, or any kind of age. I wouldn't know where to go to look up that kind of information. For all I know, that is a secret incantation from the deep magic of designers.
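For what it's worth, the incantation is less mysterious once each directive is written out. The sketch below is plain Ruby with no Rails required: the @apply_no_cache@ helper and the @CACHE_DIRECTIVES@ table are my own names, and a bare Hash stands in for @response.headers@. The one-line meanings are my summaries of the standard HTTP caching directives, not anything specific to Rails.

```ruby
# Each Cache-Control directive from the header above, with a short
# summary of its standard HTTP meaning.
CACHE_DIRECTIVES = {
  "no-cache"        => "a cache must revalidate with the server before reusing a stored copy",
  "no-store"        => "a cache must not keep any copy of the response",
  "max-age=0"       => "the response is considered stale immediately",
  "must-revalidate" => "a stale copy must not be served without revalidation"
}.freeze

# Hypothetical helper: writes both headers into any Hash-like object,
# for example response.headers inside a Rails controller action.
def apply_no_cache(headers)
  headers["Pragma"] = "no-cache" # understood by older HTTP/1.0 caches
  headers["Cache-Control"] = CACHE_DIRECTIVES.keys.join(", ")
  headers
end

# A plain Hash stands in for the real response headers here.
headers = apply_no_cache({ "Content-Type" => "text/javascript" })

headers.each { |name, value| puts "#{name}: #{value}" }
CACHE_DIRECTIVES.each { |directive, meaning| puts "#{directive}: #{meaning}" }
```

The two headers overlap on purpose: Pragma covers older caches, and Cache-Control covers everything that speaks HTTP/1.1.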

I've seen that there are simple ways of telling Javascript about model state. In Rails, we can call an ERB template that has script tags with javascript variables declared. Then, it can include a set of events and functions that a particular page will need to function properly. Those functions can depend on the values of those variables, changing the behavior of the page dramatically.
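A minimal sketch of that pattern, using only Ruby's standard ERB and JSON libraries so it runs outside Rails. The @cart@ data and the @enableEditing@ page function are hypothetical; in a Rails view the state would come from the controller, and a real page would also need to escape the JSON safely for embedding in HTML.

```ruby
require "erb"
require "json"

# Hypothetical model state; in Rails this would come from the controller.
cart = { "item_count" => 3, "editable" => true }

# The template declares a Javascript variable from server-side state,
# then lets page behavior depend on that variable.
template = ERB.new(<<~HTML)
  <script type="text/javascript">
    var cart = <%= cart_json %>;
    if (cart.editable) { enableEditing(); } // hypothetical page function
  </script>
HTML

# Serialize the state as JSON so it is a valid Javascript literal.
cart_json = JSON.generate(cart)
html = template.result(binding)

puts html
```

Serializing through JSON keeps the server in charge of the values while the page's functions stay in ordinary Javascript.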

I see that there are many ways to solve a particular problem. I don't read the right kinds of blogs to hear about the relative strengths of each one. I wasn't introduced to this kind of code the way I was introduced to data modeling. I don't see grace in the making, only as a finished product, so I don't fully appreciate what it takes to actually create such things.

I think this article uncovers that I'm still grasping for a big picture when it comes to designing usable Javascript-driven sites. What little structure I have right now is:

  • Build a simple and solid foundation with the server. Make it jQuery + some plugins. Take the time to get it right.
  • Make my javascript unobtrusive. Write callbacks and CSS selectors in small steps. Build a pattern across the pages slowly, over time. Take the time to refactor my presentation organization, since I can't know all that is happening at once.
  • Respect the work browsers are doing. Take the time to look at headers. Use Firebug's Net tab to see what's actually going across the wire. Compare sites that work with my own work to start to tease out the differences.
  • Build up simple feature sets per page. Don't try to architect this stuff from my station at this time. After a while, I'll see patterns and come up with my best practices. Don't get ahead of myself on this stuff. Just keep the code DRY, refactor, keep an audible dialog going so that I don't get ahead of myself.
  • Surround myself with better ideas. Instead of snacking on this kind of stuff every once in a while, find a steady diet of it. These are some of the resources I've found: "Scott Olson's blog":gscottolson.com/weblog/, "5 ways to make AJAX calls with jQuery":http://net.tutsplus.com/tutorials/javascript-ajax/5-ways-to-make-ajax-calls-with-jquery/, "the jQuery blog":http://blog.jquery.com/. This isn't good enough, but it's something.

Anyway, these are notes for the next person walking down a similar path, for what they're worth.

The Inexorable Push

Posted by David Richards Sun, 08 Nov 2009 07:01:00 GMT

I've heard of some of the perspectives successful authors have towards their work:

  • Terry Pratchett writes so many words a day, no matter what
  • Ayn Rand considers her page her employer, and her job to fill that page regardless of how she feels
  • Norman Mailer learns and teaches the craft of writing, then only falls back on it when passion falls short
  • Billy Collins writes poetry every morning, without judgment
  • Gerald Weinberg has a writing system for acknowledging his relation to his ideas, the Fieldstone Method
  • Natalie Goldberg makes statements and answers questions

I think about the author perspective because:

  • I want to write
  • I think technologists and authors have a lot in common

Let's forget that I want to write. That's a story for another day.

Technologists and authors have a lot in common. Both attempt to capture and express reality. Both often interface their world through keyboards. Neither one gets through this life without first getting approval from others: users, readers, managers, editors. Projects run through their lives, sometimes as express trains, sometimes as promenades. The projects often become the whole reality of both types of people.

And tonight, it's the blocks that are important. The writer's blocks, the slowdowns in a developer's productivity.

I think I seek to be my own boss because the ways I've figured out to deal with these blocks aren't the kinds of activities employers like to put up with. For me, I have to be writing code to get out of the funk. So, if the assigned code is stuck:

  • I re-write what I wrote, looking for the essence of the system, somehow lost.
  • I write tests, getting that competent momentum slowly over a few hours.
  • I write open-source software.
  • I fill out mind maps, organizing thoughts, associating thoughts, asking questions, making statements, and creating a narrative for my blocked thought process.
  • I put a stagnant project aside and come back to it another day.

These are only my more productive reactions to a stuck mind. Checking email, playing games, taking walks, chatting it up with other people, playing Trance music, or watching Hulu are other ways I cope.

This is the only humane approach to software development that I know of. It would be nice to think that I never slowed down for anything. I don't know people that never slow down. I know one that comes close. He's really amazing, cranking out code about 16 hours a day. Maybe someday I'll figure out his secrets. For the rest of us mere mortals, however, the advice and metaphor above is probably the best I can offer today.

When I ask myself what I'd do if I had all the money in the world, I honestly answer that I'd be doing this: solving problems, building teams, writing code, learning, and finding myself awash in interesting things to do. Since this is what I want to do, it's important I figure out how I do it.

Dreyfus Model

Posted by David Richards Fri, 23 Oct 2009 18:34:00 GMT

I borrowed the idea of the Dreyfus Model from Andy Hunt’s Pragmatic Thinking and Learning. Since I keep asking people where they’re at on a particular skill on that model, I thought I’d outline it in a place where I can reference it. It’s probably good to know in general, because knowing what stage I am at on each skill I need tends to reduce frustration and focus my learning. The model as I remember it is basically:

  • Novice: looking for recipes
  • Advanced Beginner: able to work with the reference material nearby, not able to see the big picture
  • Competent: Mostly able to work without having to re-architect very often
  • Proficient: Able to work at a steady pace, able to self-correct when working
  • Expert: People are seeking you out in this field

So, for what it’s worth, there it is.
