User Input Is Often Like Water, Finding All The Cracks

Makers of software and technology services must strike a delicate balance between giving users the freedom to enter their choice of input and preventing compromise of the system. Traditionally systems have assumed users have the best of intentions; this can lead to positive emergent behavior and growth. But as more business has moved on-line, so have thieves and other malicious users. It’s no wonder that malicious input is now the number one threat on OWASP’s top 10 list.

Water seeks the path of least resistance as it flows, making rivers crooked. Likewise, as the volume of user input increases it seeps into more and more areas of weakness. And as the developers of services address these weaknesses, the fixes often add complexity that bends and contorts their systems. E-mail software has become increasingly complicated because of how it is used, misused, and creatively adapted.

At times the data itself becomes deformed to fit within whatever bounds cannot be broken, similar to water filling a form. Twitter’s 140-character limit has led to or expanded creative uses of text, including hash-tagging (‘#’), at symbol (‘@’) nicknames, and URI shortening services.

Despite the advantages of allowing liberal input my experience has been that it’s usually best to start strict and loosen up later. Trying to deny values or data that people have become accustomed to is a challenge. The push back may be too much to overcome, meaning the producers must live with that data forever or try to slowly deprecate it.
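The start-strict idea can be sketched in a few lines of Python. This is only an illustration: the username field, both patterns, and the `is_valid` helper are hypothetical, not taken from any particular system. The point is that the relaxed rule accepts everything the strict rule did, so loosening never invalidates data people already entered.

```python
import re

# Hypothetical first release: a narrow allowlist for usernames.
STRICT_USERNAME = re.compile(r"[a-z0-9_]{3,20}")

# Hypothetical later release: a superset of the strict rule that also
# allows uppercase letters, dots, hyphens, and slightly longer names.
RELAXED_USERNAME = re.compile(r"[A-Za-z0-9_.-]{3,30}")


def is_valid(value: str, rule: re.Pattern = STRICT_USERNAME) -> bool:
    """Return True only when the entire value matches the allowlist rule."""
    return rule.fullmatch(value) is not None


print(is_valid("alice_01"))                       # accepted by the strict rule
print(is_valid("Alice.Smith"))                    # rejected by the strict rule
print(is_valid("Alice.Smith", RELAXED_USERNAME))  # accepted once loosened
```

Going the other way, from the relaxed rule back to the strict one, would reject names like “Alice.Smith” that users already hold, which is exactly the push back described above.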

In extreme cases systems are like submarines in the deep sea which must withstand constant, destructive pressures. Without careful management and design, user behavior and contributions can become the tail wagging the dog. This can be beneficial in some cases, though it can also lead to unmaintainable expectations. An example in the larger software ecosystem is seen as the free software movement advances and some users (at times myself included) come to expect software at very low or zero cost despite the costs involved in its production.

What do you think?

Moving To Windows For Speech Recognition

Practical speech recognition options for non-Windows operating systems are few. Yet after years of overuse my hands needed a break from the keyboard and mouse. After a few months with SphinxKeys on Linux, some experiments with Simon Listens, and reading about the limitations of Dragon Dictate for Mac the only viable option was to return to Microsoft’s OS.

As a child in the early 1990s I grew accustomed to Microsoft’s DOS and consumer editions of Windows. College and a job at a very Apple-friendly company led to spending a lot more time with Linux, enterprise Windows, and OS X. As newer versions offered speech recognition and text-to-speech I toyed with these features like everything else. Sadly those brief trials left me with the impression that they were not ready for everyday use. Years later, typing and mousing around had caught up with me. In late 2013 there was no denying it was time to revisit speech recognition, and much more seriously.

By this point I was years into Linux and loving it: powerful shells, federated package management, light resource usage, lots of software choices … besides voice input. While Linux has several speech tools they all seemed impractical:

  • IBM’s ViaVoice was sold and died out
  • Palaver sends voice data through Google, incompatible with my job requirements
  • Platypus didn’t work with my version of Dragon
  • Simon Listens was cumbersome and never worked for me
  • SphinxKeys only simulated keystroke input
  • Vedics didn’t compile and seems out-of-date

There are more options on Linux, but after trying so many I had already found more success with Windows.

Around this time an old copy of Dragon NaturallySpeaking (circa 2007) turned up at a local thrift shop. Spending some time with it revealed how useful the different modes were, showed the promise of the software development kit, and piqued my curiosity about the tools others had built on it. Sadly it didn’t support 64-bit and integration into existing software was very limited. Apart from Microsoft Office it didn’t have a lot to offer out of the box. Reviews of later versions seemed to reaffirm that the software wasn’t going to work for my needs.

Microsoft began offering Windows Speech Recognition with Windows Vista. After using it for a few months on Windows 7 I can say it does a passable job with a good, properly configured microphone. Integration with built-in software like Internet Explorer and Windows Live Mail is solid. Other applications like Miranda IM work reasonably well too. Too bad most fall back to the annoying, if usable, dictation pad. Patience and persistence help in the hunt for the most practical solutions.

WSR can be resource intensive. My computer’s memory usage climbs a bit. Things also get slower as I keep many programs open. Using a lot of tabs in IE or Firefox caused the most slowdown, making scrolling a chore. Underpowered computers like netbooks, Celeron-equipped laptops, or older desktops only served to disappoint. Your mileage may vary.

While WSR works alright as is, it really needs customization options to fit a wider variety of workflows. There are a few tools out there:

The first two also offer versions that work with Nuance’s Dragon products, which helps avoid lock-in. At this point I’ve settled for WSR Macros with some AutoIt tweaks to get voice clicking without the mouse grid and other things.

Today my voice does about 15% of the work. It helps most with e-mail, instant messaging, blogging, clicking, and window management. After seeing Tavis Rudd’s presentation on programming by voice I hope to achieve a similar proficiency. Until then the experiments will continue as time permits.

Have you ever tried speech recognition? What did you think? If you’d like to share please comment.

Fake It And They Will Come

A disturbing trend among new, often social, services is to create fake users and content in order to give the impression they are active and popular.

First impressions are not everything, but they can set expectations for both those producing a service and those consuming it. If a service such as Reddit gains traction because they followed the fake-it-until-you-make-it philosophy then what’s to stop them from again choosing dishonesty when faced with other ethical dilemmas? There is also the hypocrisy factor when services demand complete honesty in their terms of service.

Imagine this ‘seeding’ practice becomes so well known among users that it’s expected of all new sites. Consumers may become increasingly cynical and distrustful of unfamiliar offerings. It could provide more security for established services at the expense of younger ones, which would be ironic for those services which became entrenched by dishonest seeding.

There are plenty of other ways to establish a user base without resorting to lying. Offering consumers incentives for sign up and participation may also be considered shady since it is not purely natural behavior. Yet doing so offers a clear benefit for both parties without pretense. When content or users are faked the legitimate users gain only a false sense of the community or service.

What do you think? Have you encountered a service or site that relied upon falsified content? If so please consider commenting.

Please Don’t Use Vanity Versioning

As the version numbers of software and services have crept into the public consciousness, the influence of marketing has moved into the numbering process. When version identifiers no longer communicate anything besides the passage of time or marketing campaigns they are just vanity numbers.

Identifying versions of software and services can be tricky business with increasingly longer strings of digits, letters, and punctuation. Consider a version such as “1.15.5.6ubuntu4”. Ubuntu or Debian package maintainers may feel right at home, but even software engineers like myself can get lost after the first or second dot.

Software versions often begin innocently enough: “1” being the first official version, with decimal digits afterward indicating incremental change. Changes to the leading digit were often significant, noteworthy changes in the software’s behavior, capability, and/or compatibility. Sadly I fear the marketing hype that accompanied the Web 2.0 movement and Google’s Chrome browser has increased the popularity of vanity version numbering.

Sequels to movies and games are common, and when you see a number next to the title it provides instant context. You know that there may be some back-story, content, or previous experience awaiting as you encounter the 3rd or 4th release of unfamiliar franchises. While numbers have fallen out of fashion in film and games, replaced with secondary titles, they served their role well enough. And releases within a franchise like Iron Man are more individual products than versions of a single application like Internet Explorer.

Regardless, a user seeing “Opera version 15” might not realize that upgrading from 12 means more than just the usual “better than before”. Version 15 saw Opera radically change how it displays pages, handles e-mail, and supports add-ons. This release was clearly introducing breaking changes, something I consider the most important thing versions should communicate. Yet their pattern before version 15 led users to believe the first number was not so significant.

Version numbers, or technically identifiers since they’re not always strictly numeric, can communicate a lot of different things:

  • Compatibility and incompatibility
  • Addition or loss of capabilities
  • Tweaks or fixes
  • Package information
  • Passage of time
  • Progress toward the first release (like 0.9 for 90%)
  • Revisions internal to the project
  • Start of a new marketing campaign

I’d argue that compatibility, capabilities, and fixes are the most important; prioritized in that order. Which is why I think the Semantic Versioning concept is necessary. Despite being designed for Application Programming Interfaces, the behind-the-scenes components that make the ‘cloud’ and software tick, it is sorely needed in user-facing products like Internet browsers too.

Semantic Versioning’s goal of clearly communicating to machines and programmers can also help users understand potential consequences once they know the pattern. And it can be done easily, succinctly, and before they actually choose to update or install.
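To make that pattern concrete, here is a minimal Python sketch of how a tool (or a curious user) might classify an upgrade before installing it. The function names are made up for illustration, and it assumes plain “major.minor.patch” identifiers; pre-release tags, build metadata, and the special rules Semantic Versioning has for major version 0 are deliberately left out.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, int, int]:
    """Split a plain 'major.minor.patch' identifier into three integers."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def classify_upgrade(current: str, candidate: str) -> str:
    """Name the most significant part that changes between two versions."""
    old = parse_version(current)
    new = parse_version(candidate)
    if new <= old:
        return "none"   # same version or a downgrade
    if new[0] != old[0]:
        return "major"  # compatibility may break
    if new[1] != old[1]:
        return "minor"  # capabilities added, compatibility kept
    return "patch"      # tweaks or fixes only


print(classify_upgrade("12.16.0", "15.0.0"))
print(classify_upgrade("1.4.2", "1.5.0"))
print(classify_upgrade("1.4.2", "1.4.3"))
```

The order of the checks mirrors the priority argued for above: compatibility first, then capabilities, then fixes.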

Ironically it’s the APIs which users do not see that tend to be the most semantic or consistent, at least in their end-point URIs. These often contain the major version number clearly embedded like “v1” in “api.example.com/v1/”. My experience developing APIs has been that only the major version should be embedded in the URI, but minor and patch fragments can be useful for informational purposes.
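As a rough illustration of the convention, the following Python sketch pulls the major version out of an end-point URI. The helper name is hypothetical and it assumes the version appears as its own path segment in the form “v” followed by digits, as in the example above; real APIs may place or spell it differently.

```python
import re
from typing import Optional
from urllib.parse import urlsplit


def major_version_from_uri(uri: str) -> Optional[int]:
    """Return the major version from a 'v<digits>' path segment, if any."""
    path = urlsplit(uri).path
    for segment in path.split("/"):
        match = re.fullmatch(r"v(\d+)", segment)
        if match:
            return int(match.group(1))
    return None  # no version segment found


print(major_version_from_uri("https://api.example.com/v1/users"))
print(major_version_from_uri("https://api.example.com/users"))
```

Because only the major number lives in the URI, clients keep working across minor and patch releases and change the address only when compatibility is expected to break.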

One interesting hybrid scheme is Java. Its major and minor version numbers are semantic with major technically remaining at ‘1’. And at least up until version 8, the latest as of this writing, it has remained largely backward compatible with the first official release. The minor version increases as capabilities are added: 1.1, 1.2, 1.3 … 1.8. Yet since 1.5 it has been marketed using only the minor number (1.2 through 1.4 carried the “Java 2” brand). Articles referring to “Java 7” or “Java 8” are technically referring to 1.7 or 1.8 respectively. Sadly the patch (a.k.a. update) version for Oracle’s official releases has gotten complicated.
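The relationship between the technical and marketed numbers is simple enough to sketch in Python. The function name is invented for illustration, and the mapping is limited to 1.5 through 1.8 as an assumption of this sketch, since those are the releases known to be marketed by the minor number alone. Anything after the minor number, such as an update suffix, is ignored.

```python
def marketed_java_name(version: str) -> str:
    """Map a technical version like '1.8' or '1.8.0_45' to its marketed name."""
    parts = version.split(".")
    if len(parts) < 2 or parts[0] != "1":
        raise ValueError(f"expected a 1.x version, got {version!r}")
    minor = int(parts[1])
    # Assumption of this sketch: only 1.5 through 1.8 are handled.
    if not 5 <= minor <= 8:
        raise ValueError(f"minor version {minor} is outside this sketch's range")
    return f"Java {minor}"


print(marketed_java_name("1.7"))
print(marketed_java_name("1.8.0_45"))
```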

If you are one of those privileged individuals choosing a version number please don’t get caught up in the hype. Let the needs of your users determine what is appropriate. And as I have the opportunity I’ll try to do the same.

Have you chosen a version identifier? Do you have some thoughts to share? If so please consider commenting.

Migrating From Apple OS X To Linux

Making the move to Linux from OS X was surprisingly easy in 2007. Being a software engineer with some limited Linux experience certainly helped. Choosing a user-friendly distribution with obvious customization tools, Kubuntu in this case, also facilitated the migration.

Before taking the plunge I had unknowingly worked with Linux servers in college. Later some coworkers had installed it on some computers I was trying to refurbish. And the experience of trying to remove Red Hat Linux reminded me of trying to purge a virus. (It had taken hold of the computer disk’s boot record and boot partition; both foreign concepts to me.)

Yet with time I became strangely drawn to the idea of a free OS that was immune to most computer viruses. So I tried dual booting both Windows 98 and Red Hat Linux 9 on my laptop for a while. It was mostly just an experiment, though, since it didn’t run most of the computer games or the school software that I wanted to use.

School and work had also introduced me to the world of Apple’s OS X. While the one-button mouse was an annoying limitation, almost everything else about it was easy and intuitively obvious despite years with Microsoft’s products. Having used only OS X at the office for a few years made it feel more and more like home.

After reading about Ubuntu’s goal of becoming a more user-friendly distribution of Linux I decided to try using it as a work desktop instead. Kubuntu’s Windows-like layout felt familiar with its task-bar, start menu, and default shortcuts. By then, however, using OS X’s Command key as the modifier for application switching, copying, pasting, and quitting had become second nature. Thankfully Kubuntu’s KDE interface allowed me to use my keyboard’s Command key (called ‘meta’ in Linux) to serve most of the same functions as it had with OS X.

At the time Kubuntu’s applications got the job done, if awkwardly. K-mail in particular was awkward compared to Apple Mail. Sadly the next best alternative, Evolution, crashed too frequently for comfort. Despite the lack of slick integration and polish, applications worked well enough:

  • Firefox was more or less the same
  • Java worked similarly and with fewer quirks
  • K-mail had GPG encryption support
  • OpenOffice ran better
  • Plenty of terminals were available
  • Subversion worked fine

Over the years I upgraded with releases, tried a few betas (not a good idea for one’s primary workstation), and moved on to Gnome-based Ubuntu and XFCE with Xubuntu. Quirks along the way included post-upgrade problems requiring re-installation, hardware incompatibilities, software incompatibilities, and differences in hot-key configuration.

Ubuntu distributions have also increased system requirements over the years. What was once nice and snappy on a 2004-era desktop with only 650 MB of RAM became almost unusable by 2009. Ubuntu’s move to the Unity desktop has also played a role. Despite its similarity to OS X I found Gnome-2-esque or Windows-like desktop environments more comfortable.

Looking back after seven years I’m mostly satisfied with Linux’s performance as a desktop OS. But at times it certainly required persistence and willingness to learn the terminal to resolve quirks. In recent years terminal usage has been less and less necessary. Hardware vendor support has also made it more practical.

For the sake of user freedoms I hope it can someday satisfy all desktop users; though, as a software producer I have doubts about the impact mature and free software like Linux will have on the labor market and price expectations.

Have you tried–or considered–Linux on your desktop? If you’ve something to share please consider leaving a comment.