Name fields are a metaphor for human diversity

Diversity is complex, because everyone is different. In this post I'm going to talk about what we learnt when we worked on a recent system storing the names of people from across the world.


First, I will use this analogy: If you’re able to walk, then you could look at how our offices are accessible to people in wheelchairs. We comply with all regulations and requirements and then some. But if you’d ever tried to get through the doors in here, then you’d know they’re a pain to get through in a wheelchair. It’s only if you use a wheelchair and try to get around that you realise the difficulties wheelchair users face.

As a developer of buildings, it can be handy to spend some time experiencing other people’s problems. As a developer of software, there’s an area that helps illustrate how tied we are to our own cultures, and how little awareness we have of others and the diversity of people.

Name fields!

To illustrate, here’s a real form, run on our site using the extremely popular Gravity Forms. You can use it right here, now, to sign up for our newsletter. If you want to stay up to date with our latest company news, you should probably subscribe 😉 :

Newsletter

I'd like to follow what's happening with interconnect/it so sign me up to the newsletter!


What do you notice about it? Assuming you didn’t just scroll through this post, look at the subheadings, think you’d reached the bottom and decide to go back to Facemonkey or Twitsonline or whatever, what’s wrong with it?

Nothing. You can see nothing wrong with it. I would imagine that almost nobody reading this article would see the problem.

Unless you try to complete it and happen to be Triyatno, a famous Indonesian weightlifter. He only has one name. Then what does he do? Where does he put his name? Of course, to be pedantic, his only name is both his first name and his last name. He could complete both sides of the form with “Triyatno” and we’d be fine. Except the system would report his name incorrectly in some cases by emailing him as Triyatno Triyatno.

In fact, if he moved to many countries, Triyatno would need to become Triyatno Triyatno, because many legal systems have no way of dealing with mononym cultures.

I’m entirely aware of mononym cultures, yet on our website I use a system that enforces a two name paradigm. Why?

Because all the systems we work with are designed this way. We cannot fix everything about the systems around us. We are culturally bound, even when we know better. So only with systems we build ourselves do we have the opportunity to break out.

Enter the Koreans

Koreans (and many others) write their names the other way round to Westerners, with the family name first. Who’s the most famous Korean around? I guess for some it’s the founding leader of North Korea, Kim Il-Sung. Or maybe Gangnam Style singer Park Jae-sang. Their family names are Kim and Park respectively. Most mail merges would write to them as “Mr Il-Sung” and “Mr Jae-sang.”

I don’t like getting mails that say “Dear Mr David.” So I doubt they’d like it either.

It just seems so sloppy. And impersonal.
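To make the mail merge failure concrete, here is a minimal Python sketch. The function name `naive_salutation` and the rule it applies (treat the last word as the family name) are assumptions made for illustration, not the behaviour of any particular mail merge product.

```python
def naive_salutation(title: str, full_name: str) -> str:
    """Guess the family name by position: take the last word of the name.

    This is the hypothetical rule being illustrated. It is only right for
    names written given-name-first.
    """
    guessed_family_name = full_name.split()[-1]
    return f"Dear {title} {guessed_family_name}"


# Family-name-first names come out wrong:
print(naive_salutation("Mr", "Park Jae-sang"))  # Dear Mr Jae-sang
print(naive_salutation("Mr", "Kim Il-Sung"))    # Dear Mr Il-Sung
```

The guess is made from word position alone, so nothing in the stored data can correct it. The person has to be asked how they want to be addressed.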

Technical cost

The awful reality is that coding software to take into account the vagaries of the human world is hard. Once we learn, we can deal with the differences in people we know and work with. If I meet a Korean, I know what to do about names. If an Indonesian gives me a single name, I can handle that. In Britain, you can call yourself whatever you like. It’s OK to be called ‘Prince’ and only ‘Prince.’ And this conveniently brings us to what happens if your name isn’t writable in Latin characters. How can our system handle Prince’s symbol?

It can’t. Tough, yeah? Because Prince’s symbol will never be available in Unicode, as they won’t add symbols pertaining to individuals. Even Unicode’s vast tables of characters can’t handle the fullness of human diversity.

So here’s a non-exhaustive list of considerations about names:

  1. Not everyone has one.
  2. If they do, it may not be possible to represent the said name in your system. How’s your encoding?
  3. Don’t assume people always have a first name.
  4. Or a last name.
  5. Don’t assume the sort order of a family name. Does Van Halen come with the Hs or the Vs?
  6. Don’t assume people have prefixes.
  7. Or suffixes.
  8. Do allow for prefixes.
  9. And suffixes.
  10. Or is it better to skip prefixes and suffixes altogether?
  11. People change their names.
  12. In some countries you are entitled to be called what you want, regardless of birth certificates or any official paperwork.
  13. Some names are really really long.
  14. Names can at times include symbols, such as an exclamation.
  15. Sorting names is non-trivial and requires cultural knowledge.

Names, therefore, are indicative of the complexity and range of edge cases in the huge diversity of humanity.
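Several items on that list come down to not rejecting input. As a rough sketch in Python, a name could be cleaned up without being judged. The function name `clean_name` is hypothetical, and the decision to normalise to NFC is an assumption about what a system might want, not something the list prescribes.

```python
import unicodedata
from typing import Optional


def clean_name(raw: str) -> Optional[str]:
    """Tidy a name as entered, without deciding what a valid name looks like.

    There is no length limit, no required first or last part and no list of
    allowed characters. None is returned when nothing was entered, since not
    everyone has a name to give.
    """
    # NFC gives one canonical form for accented letters, so "é" typed as a
    # single code point and "e" plus a combining accent are stored the same.
    name = unicodedata.normalize("NFC", raw).strip()
    return name or None


print(clean_name("  Triyatno "))  # Triyatno
print(clean_name(""))             # None
```

Whether an empty name should be allowed at all depends on what the system is for, so treat the `None` case as a prompt to decide rather than a recommendation.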

A systems answer, where possible

If you want to be ‘correct’ about it, you’ll write super complex systems that allow you to identify the cultural origins (and changing cultural positions) of the people whose data you’re storing. You’ll be able to deal with someone from Indonesia who now lives in Switzerland. Better still, when you’re reconciling two entries, one from a Swiss system and one from an Indonesian one, you’ll understand that “Triyatno Triyatno” is the same person as “Triyatno.”
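Even the narrow reconciliation case needs a rule. The Python sketch below is one possible heuristic, assuming that a two-word name with identical words is a mononym that was doubled to satisfy a two-field form. The function names are hypothetical, and the rule would misfire for anyone whose name really does repeat.

```python
def collapse_doubled_mononym(name: str) -> str:
    """Turn 'X X' back into 'X'. Any other name is returned unchanged."""
    parts = name.split()
    if len(parts) == 2 and parts[0].casefold() == parts[1].casefold():
        return parts[0]
    return name


def may_be_same_person(name_a: str, name_b: str) -> bool:
    """Compare two names after undoing doubling and ignoring letter case."""
    a = collapse_doubled_mononym(name_a).casefold()
    b = collapse_doubled_mononym(name_b).casefold()
    return a == b


print(may_be_same_person("Triyatno Triyatno", "Triyatno"))  # True
print(may_be_same_person("Park Jae-sang", "Park"))          # False
```

A match here is only a hint that two entries deserve a closer look. It is not proof of identity.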

This is, of course, one reason why machine learning is so popular. It can potentially handle the fuzziness of the human condition without having to reduce everything down to empirical and discrete conditions.

Yet… these machine learning tools will acquire the biases of their trainers. Just like the conventional logic based systems.

So when we write systems that try to reflect humanity, we have to think “What are we trying to show, and what is the purpose of what we’re coding?”

In a recent system we built, we realised that we were going to be storing the names of people from across the world including Koreans. The first name/last name paradigm wouldn’t work either, in case we had a senior Indonesian with one name. There is a simple solution, however. Here’s how it looks in the back-end, and everybody is happy:

You can put any name, using any Unicode supported character into the name field, and you can choose to sort it differently. So if someone’s name is ❤️  then that’s OK. We can store that. And to sort it? Well, we can use the text string of “Heart” in the Family Name field so that the person appears between G and I. Prince’s symbol, if we could store it as text (we can’t), could go in with the Ps and everyone would expect him there.

It saves trouble when dealing with someone from France called La Tour who expects to be with the Ts, but a British person called L’Estrange who expects to be with the Ls.
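The idea of one free-text name plus a separate, human-chosen sort value can be sketched in a few lines of Python. The class and field names (`Person`, `name`, `sort_name`) are assumptions for illustration, with `sort_name` playing the part of the Family Name field described above. This is not the schema of the system itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    name: str       # displayed exactly as the person entered it
    sort_name: str  # plain text chosen by a person, used only for ordering


people = [
    Person("Triyatno", "Triyatno"),
    Person("La Tour", "Tour"),             # expects to be with the Ts
    Person("Park Jae-sang", "Park"),
    Person("L'Estrange", "L'Estrange"),    # expects to be with the Ls
    Person("❤️", "Heart"),                 # lands between G and I
]


def sorted_for_display(records):
    """Order records by the chosen sort text, ignoring letter case."""
    return sorted(records, key=lambda p: p.sort_name.casefold())


for person in sorted_for_display(people):
    print(person.name)
# ❤️
# L'Estrange
# Park Jae-sang
# La Tour
# Triyatno
```

No rule in the code knows anything about French particles or Korean name order. That knowledge sits in the data, supplied by whoever entered the record.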

It also gives us a straightforward way of dealing with the problem that someone whose name starts with Å may end up being sorted at the end of the alphabet by many computer systems, behind the Zs. We even allow ourselves a way around dealing with diacritical marks, which can have very specific sort orders in different languages. For example, in French the order of Es with marks would be é, è, ê, ë… yet in Italian the convention is to sort them the other way around.
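The Å problem is easy to reproduce. In the Python sketch below, the three surnames and the `sort_keys` mapping are made up for illustration. The first sort uses Python's default string comparison, which goes by code point, and the second uses a sort text chosen by hand.

```python
names = ["Zola", "Åberg", "Adams"]

# Default comparison is by Unicode code point, and Å (U+00C5) comes after Z.
print(sorted(names))
# ['Adams', 'Zola', 'Åberg']

# Hypothetical hand-chosen sort text for each name.
sort_keys = {"Zola": "Zola", "Åberg": "Aberg", "Adams": "Adams"}

print(sorted(names, key=lambda n: sort_keys[n].casefold()))
# ['Åberg', 'Adams', 'Zola']
```

Neither order is universally right. The Swedish alphabet puts Å after Z, so a Swedish reader may well prefer the first result, which is why the choice is left to a person rather than a rule.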

So imagine, you’re writing a database system that has to display content to people across the world… they will look at a large amount of data on the screen and have expectations of the order in which things come. You really need to understand the cultural basis of the person looking. Are they in Italy? Or are they English, but based in Italy, in which case what do they expect?

And this is why it can be hard to write software for the whole world to use. It may feel like the internet is a borderless and boundless place, but in reality, it is bounded by culture and language. Even though the internet is the most sophisticated machine ever built by humankind, it fails at dealing with the complexity of diversity that the world can throw up.

It’s also actually why good software is never ‘done’. There are always more cases to fix, more diversity to deal with, more ways in which actions can be interpreted – and none are necessarily right or wrong. And playing the 80:20 game leaves you open to losing out a lot, because everyone’s idea of which 80% matters is different.

Diversity failures are sometimes used to make good people feel bad. “Bwahaha! Your system incorrectly sorts Irish names starting with Ó! You’re rubbish! You don’t care about diversity even though you say you do!” It’s not actually an unusual thing to see, but often the accusers are people who couldn’t give two stuffs about diversity. They just enjoy bullying people regardless of the reason. Look at Twitter to see plenty of examples.

The complexity and ‘never done’ nature of satisfying diversity is also sometimes used as an excuse by some to bother less than they should. But the truth is, it’s good to try. And if you think carefully, early on, then you can be inclusive without being crazy. Our handling of names in our custom-built software is straightforward. We can even easily add another column to the system to give a preferred name, so that can be used for correspondence.
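As a rough Python sketch of that extra column, assuming hypothetical field names (`name`, `sort_name`, `preferred_name`) and a hypothetical `greeting` helper, correspondence can fall back to the full name when no preference has been recorded:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Person:
    name: str                             # the name as the person gave it
    sort_name: str                        # text used only for ordering
    preferred_name: Optional[str] = None  # how they want to be addressed


def greeting(person: Person) -> str:
    """Use the preferred name if one is recorded, else the name as given."""
    return f"Dear {person.preferred_name or person.name}"


print(greeting(Person("Park Jae-sang", "Park", "Mr Park")))  # Dear Mr Park
print(greeting(Person("Triyatno", "Triyatno")))              # Dear Triyatno
```

The software never has to work out which part of a name is the family name, because the person has already said how they want to be addressed.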

Yes, all that is harder than having a simple forename/surname approach, but it’s inclusive, it’s not that much harder, and it covers a huge range of possibilities. It also saves masses of contextual logic.

And don’t get me started on addresses…