This is the second part of a two-part blog post about some of our work on making it easier to deploy FixMyStreet and MapIt in new countries. This part describes how to generate KML files for every useful administrative and political boundary in OpenStreetMap.
The previous post on this subject described how to take the ID for a particular relation or way that represents a boundary in OpenStreetMap, and generate a KML file for it. That’s much of the work that we needed to create MapIt Global, but there are a few more significant steps that were required:
Efficiently extracting boundaries en masse
The code I previously described for extracting a boundary from OpenStreetMap used a public Overpass API server. This is fine for occasional boundaries, but, given that there are hundreds of thousands of boundaries in OSM, ideally we don’t want to be hitting a public server that many times – it puts a large load on that server, and is extremely slow. As an alternative, I tried parsing the OSM planet file with a SAX-based parser, but this also turned out to be very slow – multiple passes of the file were required to pick out the required nodes, ways and elements, and keeping the memory requirements down to something reasonable was tricky. (Using the PBF format would have helped a bit, but presented the same algorithmic problems.) Eventually, I decided that a better approach was simply to set up a local Overpass API server, and to query that. This is a great improvement – it allows very fast lookups of the data we need, and the query language is flexible enough to be able to retrieve huge sets of relations and ways that match the tags we’re interested in. It also means we would no longer have problems if connectivity to the remote server went down.
Another question that arose when scaling up the boundary extraction was, “Which set of tags we should consider as boundaries of interest?” On the first import, we considered any relation or way with the tag boundary=”adminstrative” and where the admin_level tag was one of “2″, “3″, “4″, … “11″. At the time, there were about 225,000 such elements that represented closed boundaries. Afterwards, it was pointed out to me that we should also include elements with the tag boundary=”political”, which includes parliamentary constituencies in the UK, for example. For later import purposes, I gave each of these boundary types a 3-letter code in MapIt, which are as follows:
- O02 (boundary="administrative", admin_level="2")
- O03 (boundary="administrative", admin_level="3")
- O11 (boundary="administrative", admin_level="11")
- OLC (boundary="political", political_division="linguistic_community")
- OIC (boundary="political", political_division="insular_council")
- OEC (boundary="political", political_division="euro_const")
- OCA (boundary="political", political_division="canton")
- OCL (boundary="political", political_division="circonscription_législative")
- OPC (boundary="political", political_division="parl_const")
- OCD (boundary="political", political_division="county_division")
- OWA (boundary="political", political_division="ward")
Importing Boundaries into MapIt
The next step in building our service was to take the 236,000 KML files generated by the previous step and import them into MapIt.
The code that creates the KML file for an element includes all its OpenStreetMap tags in the <ExtendedData/> section. On importing the KML into MapIt, there are only a few of those tags that we’re interested in – chiefly those that describe the alternative names of the area. We have to pick a canonical name for the boundary, which is currently done by taking the first of name, name:en and place_name that exists. If none of those exist, the area is given a default name of the form “Unknown name for (way|relation) with ID (element ID)”. There are also tags for the name of a country in different languages, which we also import into the database so that localized versions of the name of the boundaries will be available through MapIt with their ISO 639 language code.
Another tricky consideration when importing the data is how to deal with boundaries that have changed or disappeared since the previous import. MapIt has a concept of generations, so we could perfectly preserve the boundaries from the previous import as an earlier generation. This would certainly be desirable in one respect, since if someone is depending on the service they should be able to pick a generation that they have tested their application against, and then not have to worry about a boundary disappearing on the next import. However, with quarterly imports the size of our database would grow quite dramatically: I found that approximately 50% of the boundaries in MapIt Global had changed over the 5 months since the initial import. Our proposed compromise solution is that we will only keep the polygons associated with areas in the two most recent generations, and notify any known users of the service when a new generation is available for them to test and subsequently migrate to.
For reference, you can see the script that extracts boundaries and generates the KML files and the Django admin command for importing these files into MapIt.
The end result: MapIt Global
The aim of all this work was to create our now-launched web service, MapIt Global. As far as we know, this is the only API that allows you to take the latitude and longitude of any point on the globe, and look up the administrative and political boundaries in OpenStreetMap that surround that point. We hope it’s of use to anyone trying to build services that need to look up administrative boundaries – please let us know!
Photo credit: Hadrian’s Wall by Joe Dunckley,
One of the common elements you will find across mySociety’s sites is that they have features designed to reduce duplicate messages or reports being sent to politicians, governments or companies. We often do this in quite a subtle way, so it is worth spelling out here how we do this across several sites:
- If you start to report a broken street light or pothole on FixMyStreet, you’ll see local problems before you start to type in your own details. This means if the problem is already there, you can see before you waste any effort.
- If you use WhatDoTheyKnow to send a Freedom of Information request to a public body, we provide a facility which encourages users to search through other people’s requests before they type their new request in.
- If the 08:10 train you take to work is always late, when you go to report it on FixMyTransport, we show you all the other problems already reported on that route. If someone else has already set up a page, you can press the big green ’join’ button, and show your support.
- If more than a handful of people try to use WriteToThem to send an exact duplicate of the same message to a politician, it will prevent it. This is because we know that politicians listen much, much more to individual messages from constituents than bulk mails.
This pattern – trying to intervene before people write identical messages or reports – is a design decision that makes a big difference to the way these sites operate. As usual with mySociety sites, this little feature seems like the sort of thing that would be quite tempting to skip when building a copy. But it really matters to the long term success of the sites. There are three reasons why.
First, there is a simple public benefit that comes from saving time. There’s no point us wasting your time if a report or request has already been sent, especially around minor issues. Saving your users time makes them happier and more likely to enjoy their experience.
Second, if you can spot that someone is about to send a duplicate message, we may be able to encourage that user to support the existing report instead of making a new one. For example, on FixMyStreet you can add an update to an existing pothole report (“it’s getting worse!”).
This feature is most visible, and most mature, on FixMyTransport, where users are clearly encouraged to ‘support’ pre-existing reports, rather than making new copies. By discouraging duplicate reports, we let people with a shared problem work together, even if this only means adding themselves as a “supporter” and doing nothing else. We know that many people search for, and find, problem reports which have turned into these little campaigns, which they then join and help. So even if they are only reading them (not joining them) that exposure can have some value to the people affected. This would be diluted if we created lots of similar reports about the same problem.
Third, we discourage duplicates for the benefit of the governments and companies receiving messages. We don’t think FixMyStreet is effective because it lets people moan: we think it’s effective because it helps local government to be effective by giving them good quality reports about local problems, in formats that area easy to handle. This good quality reporting increases the chance that the government will understand the problem and act on it, which leads to our main goal – citizen empowerment. Recipients are unlikely to help users if many of the messages they get are confused, inaccurate or duplicates, so we work on all these fronts.
So if you haven’t thought about this before, notice how the “work flow” through our sites makes you see similar problems before you’ve finished reporting your own. This is the implicit way to prevent duplication. We don’t have “Stop! Warning! Check this is a new problem!” messages, because we never want to discourage genuine users. But the careful design of the interface gently discourages, successfully, duplicate reports, and encourages supporting of other items.
It’s never possible to entirely prevent duplication. But we try hard, because it’s always better to join people together around common causes, than it is to let them struggle alone.
When you report a problem on FixMyStreet.com, the site displays a map for you to click on to indicate its exact location. Of course, you can zoom in and out of that map, but when it is first displayed, FixMyStreet needs to use an initial ‘default’ zoom level. Ideally, this is a zoom level that reduces the number of clicks required before a user can pinpoint the location of their problem.
And here’s where we encounter a tricky problem. The world is a varied place – some towns are very dense with buildings and streets crammed close together. In these areas you need to default to a zoom that’s quite ‘close in’, otherwise it can be hard to locate your problem.
But out in the countryside, we have the opposite problem. You can have huge areas where there’s nothing but blank fields or moorlands. If the default map zoom is ‘close in’ here then users are likely to see a big map full of nothingness, or maybe just a single stretch of unidentifiable road.
So, what is to be done?
The answer is this – every time you search for a location in FixMyStreet the website does a check to see whether the location you typed is in an area where a lot of people live, or very few people live.
mySociety has been storing this population density data in a webservice which we call Gaze. If the area you searched for is in a densely populated area we assume that it must be an urban location, and the map starts with a helpfully zoomed-in map. But if you’re in a sparsely populated area then it’s probably rural, so FixMyStreet starts zoomed out, making it easier to get an overview of the whole area.
Where do we get the data from? Our late colleague Chris discusses this in a blog post from 2005 — the short answer is NASA SEDAC and LandScan. It’s an interesting example of how unexpected things can happen when data is made public — if population density wasn’t available to us, we wouldn’t have been able to put this small but clever detail into FixMyStreet’s interface.
Simple things are the most easily overlooked. Two examples: a magician taking a wand out of his pocket (see? so simple that maybe you’ve never thought about why it wasn’t on the table at the start), or the home page on www.fixmystreet.com.
The first thing FixMyStreet asks for is a location. That’s so simple most people don’t think about it; but it doesn’t need to be that way. In fact, a lot of services like this would begin with a login form (“who are you?”) or a problem form (“what’s the problem you want to report?”). Well, we do it this way because we’ve learned from years of experience, experiment and, yes, mistakes.
We start off by giving you, the user, an easy problem (“where are you?”) that doesn’t offer any barrier to entry. Obviously, we’re very generous as to how you can describe that location (although that’s a different topic for another blog post). The point is we’re not asking for accuracy, since as soon as we have the location we will show you a map, on which you can almost literally pinpoint the position of your problem (for example, a pothole). Pretty much everyone can get through that first stage — and this is important if we want people to use the service.
How important? Well, we know that when building a site like FixMyStreet, it’s easy to forget that nobody in the world really needs to report a pothole. They want to, certainly, but they don’t need to. If we make it hard for them, if we make it annoying, or difficult, or intrusive, then they’ll simply give up. Not only does that pothole not get reported, but those users probably won’t bother to try to use FixMyStreet ever again.
So, before you know it, by keeping it simple at the start, we’ve got your journey under way — you’re “in”, the site’s already helping you. It’s showing you a map (a pretty map, actually) of where your problem is. Of course we’ve made it as easy as possible for you to use that map (and, yes, we also let you skip it if for some reason a map doesn’t help you). You see other problems, already reported (maybe you’ll notice that your pothole is already there — and we haven’t wasted any of your time making you tell us about it, even though you probably didn’t realise we intended right from the start to show you other people’s problems before you reported your own). Meanwhile, behind the scenes, we now know which jurisdictions are responsible for the specific area, so the drop-down menu of categories you’re about to be invited to pick from will already be relevant for the council departments (for example) that your report will be going to.
And note that we still haven’t asked you who you are. We do need to know — we send your name and contact details to the council as part of your report — but you didn’t come to FixMyStreet to tell us who you are, you came first and foremost to report the problem. So we focus on the reporting, and when that is all done then, finally, we can do the identity checks.
Of course there’s a lot more to it than this, and it’s not just civic sites like ours that use such techniques (most modern commerce sites have realised the value of making it very easy to take your order before any other processing; many governmental websites have not). But we wanted to show you that if you want to build sites that people use, you should be as clever as a magician, and the secret to that is often keeping it simple — deceptively simple — on the outside.
This is the first part of a two-part blog post about some of our work on making it easier to deploy FixMyStreet and MapIt in new countries. This part describes how to generate KML for a given boundary in OpenStreetMap. Update: the second part is now available.
As mentioned in a previous post, we’re looking at ways of making it smoother to deploy FixMyStreet for a new country, city or use-case. Essentially there are two fundamental bits of data that you need for this:
- a mapping between a latitude and longitude (or postcode) to all the adminstrative areas that cover that point.
- a mapping between problem type and adminstrative areas to an appropriate email address for reporting that problem.
The first of those is typically provided by a service called MapIt, an open source GeoDjango web application written by Matthew Somerville. In the UK we are fortunate that official boundary data and the postcode database have now been released under an open data license from the Ordnance Survey. However, in other many other countries similar data is unavailable, or not available under reasonable licensing conditions. In such cases, though, all is not lost thanks to the extraordinary work of contributors to the OpenStreetMap project. OpenStreetMap contains high-quality administrative boundary data for many countries of the world, and we know that data submitted to the project is available under the Creative Commons Attribution-ShareAlike license, so we can reuse it in a web service like MapIt.
The first step towards being able to build an instance of MapIt based on OpenStreetMap boundary data is to be able to generate a shapefile that represents a boundary in the project. (In OpenStreetMap’s data model, boundaries are represented as either ways or relations, and those that we are interested in are tagged with boundary=administrative.) Matthew had previously written code to generate KML files for boundaries in Norway in order to help to set up an instance of MapIt for Norway, which is used by FiksGataMi. However, that script was quite specific to the organization of boundaries in that country, and it did not deal with more complex boundary topologies (e.g. enclaves), or different representations of boundaries (e.g. multiply nested relations).
So, Matthew and I wrote a new version of the code to extract a boundary from OpenStreetMap and generate a KML representation of it. The new version uses the Overpass API instead of XAPI, since it allows us to specify multiple predicates in the query and recursively fetch the ways and nodes that are contained in a relation. Once all the ways that make up a relation have been fetched (ignoring those with roles like “defaults” or “subarea”), the script tries to join each unclosed way to any other with which it shares an endpoint. We should end up with a series of closed polygons – the script exits in error if there are any unclosed ways left. We can then directly create KML from these polygons, the only subtlety being that we need to mark certain boundaries as being an inner boundary (i.e., creating a hole in a boundary) if they had the role “enclave” or “inner” in an OpenStreetMap relation. For example, the South Cambridgeshire District Council boundary has a Cambridge City Council-shaped hole in it:
Similarly, the script has to cope with multiple distinct polygons, such as the boundary of Orkney.
$ bin/boundaries.py --help Usage: boundaries.py [options] Options: -h, --help show this help message and exit --test Run all doctests in this file --relation=<RELATION_ID> Output KML for the OSM relation <RELATION_ID> --way=<WAY_ID> Output KML for the OSM way <WAY_ID>
For example, to generate a KML boundary for the Hottingen area of Zürich, you can do:
$ bin/boundaries.py --relation=1701449 > hottingen.kml
In the next blog post in this series, we will discuss extracting such boundaries en masse and creating a service based on them.
DIY mySociety is all about making our code – and our experience – available to people who want to build similar websites in their own countries. We thought it would be helpful to list some examples of sites already using mySociety code, so you can see the variety of different possible outcomes.
It might seem like a simple task, but identifying sites in this way isn’t as straightforward as you might think – we don’t always know when people pick up our open source code! If we’ve missed any, please do comment below and we’ll add them.
There are also many sites around the world which were directly, or indirectly, “inspired by” ours. In these cases, the site’s owners have written their own code from scratch. That’s a subject – and a list – for another post. For now, here are all the international sites using mySociety’s code that we know about.
Alaveteli: our Right-to-Know Platform
WhatDoTheyKnow.com – our original Freedom of Information site
FYI.org.nz – New Zealand Freedom of Information site
Pravodaznam – Bosnia and Herzegovina Freedom of Information site
Queremossaber.br – Brazil Freedom of Information site
Informatazyrtare.org – Albania Freedom of Information site
Tuderechoasaber.es – Spain Freedom of Information site
AskTheEU – Europe Freedom of Information site
FixMyStreet: our fault-reporting Platform
FixMyStreet.com – our original fault-reporting site
Fiksgatami – Norway FixMyStreet
FixOurCity – Chennai FixMyStreet
FixMyStreet.br – Brazil FixMyStreet, based on both our code and FixMyStreet.ca from Canada
Parliamentary monitoring and access to elected representatives
TheyWorkForYou – our original parliamentary monitoring site
WriteToThem – our original ‘contact your representative’ site
Mzalendo – Kenya parliamentary monitoring site
Open Australia – Australia parliamentary monitoring site
Kildare Street – Ireland parliamentary monitoring site
Parlamany – Egypt parliamentary monitoring site
Mejlis – Tunisia parliamentary monitoring site
Find out more about the Components behind these sites, PopIt and MapIt, on the Components mailing list.
A community of people, waiting to help
Inspired by the examples above? If you’re thinking of going ahead and building your own site, we’re here to support you with our easy-to-understand guidebooks and our friendly mailing lists (see links to the right). In our online communities you’ll find many of the people who built the sites listed here. There’s no-one better to ask questions, because they’ve been through the process themselves, from early conception right up to completion.
If you are one of those people who has been through the whole process of building, launching and running a site like these (with or without our codebase), and lived to tell the tale, please shout in the comments below. And especially if you’re open to people approaching with questions. Perhaps add a note to say where you prefer to have those conversations – whether that’s via your favourite mailing lists, Twitter, email or simply in the comments to this post.
One last thought – it’s interesting to see that our code can be used for areas as small as a single city (FixMyStreet Chennai) or as large as a confederation of states (AskTheEU.org). In short, it’s scalable! How will you use it?
Image by Windell Oskay, used with thanks under the Creative Commons licence.
We’ve already put a lot of work into making the code base for the FixMyStreet platform generic and country-neutral, but we’d like to make the process of setting up such a site easier than it is at present.
The first step in this project is going to be contacting as many people across the world as we can who have thought about trying to set up their own version of FixMyStreet, or who’ve actually tried.
We’ll be talking to people ranging from those who have active, running sites, to those who never got past the stage of thinking it might be a nice idea. We want to find out what things presented particular difficulties, and which of the next steps we’re considering would make the greatest difference to international adoption of FixMyStreet.
Some of the things we’re interested in, for example, include:
- Would you be interested in a hosted version of FixMyStreet, or do you prefer to set up a version locally?
- How difficult did you find the process of finding administrative boundaries for your country? Are the boundaries in OpenStreetMap good enough for your use? As a last resort, would you want to draw the boundaries manually?
- Would you be interested in an automated setup procedure which deploys a new server for your locality and then just requires web-based setup?
At mySociety, we’re working really hard to create software tools that are attractive and easy to set up in diverse countries, cities and regions. Now we want to make sure everyone knows what we offer, and how it can be useful. This is a beginners’ guide to what mySociety can offer in the way of software tools.
First up, the basics:
- All our code is open source.
- Some of our code is available in simple-to-use packages.
- There are two types of package. We call them Platforms and Components. This post is about explaining the difference.
You can think of Platforms as flat-pack websites – like furniture that arrives in a cardboard box, with all the screws, instructions and tools included. Our Platforms provide everything you need to replicate a site like FixMyStreet.com or WhatDoTheyKnow.com in your own country, city or region, but you need to do a little work to get it up and running.
Platforms are great for people who don’t want to spend a long time reinventing the wheel, and who want to get a basic, functional site up and running as fast as possible.
We provide the software, and you just need to add:
- Data to populate it
For example, if you’re setting up a website using the FixMyStreet platform, you need the names and email addresses of every bit of government that you want to send reports to. (This isn’t as daunting as it might sound – it might just be one authority and one email address! And if not, well, we’ve had lots of success with crowd-sourcing this sort of information).
- A server to host it on
We can help you here, if it’s a problem for you. See step 1 on this page.
- Enthusiastic people to run it
Don’t forget this vital consideration! Computers are great, but they can’t do everything themselves. You will need people – volunteers or paid staff – to promote, improve, and interact with the users of your website.
The following platforms are available to download and install:
For reporting common street problems such as potholes or broken streetlights. Creates transparency about local government, at the same time as providing a practical service to users.
Our Freedom-of-Information Platform. Whether or not your country has a Right to Know law, this Platform lets people ask questions to public authorities, – and it publishes all the conversations online.
If Platforms are like a flat-pack piece of furniture, Components are more like the parts of a kitchen. When you have a kitchen built, you get to choose from a number of parts that fit together: cupboards, drawers, shelves, etc. You can ignore things you don’t want, and add in things you do – and you end up with a kitchen that suits your needs.
Components will save you a lot of time because you won’t need to create database structures, APIs, search mechanisms, admin interfaces, and so on. Just slot in a Component – like you might slot in a dishwasher – and it’s all done for you. We’ve done our best to make them easy to deploy, easy to customise, and easy to connect together.
You will definitely need technical skills, although we are working on lowering that barrier. Components cannot run on their own – they need a website to fit into. And just as with our Platforms, you’ll need data. But you don’t need a server – we host the Components ourselves.
Right now we just have one component which is fully documented and ready to use, but we’re working on followups right now. This component is called MapIt.
MapIt is a web service which you can use to work out which boundaries a point or postcode exists within. An essential foundation for geographic lookups of all kinds. You can play with the UK instance here. We use it on:
- Our parliamentary monitoring website TheyWorkForYou.com. Users are shown their own MP’s data even if they don’t know who that MP is – all they have to do is input their postcode.
- Our ‘contact your representative’ site WritetoThem.com. Users input their postcode and are shown everyone who represents them, from local to European level.
- Our street problem-reporting site FixMyStreet.com. It sends problem reports to the relevant local council, based on the co-ordinates of where the problem was reported.
We are also working on a new component for building Parliamentary Monitoring Websites on top of, called PopIt. It isn’t quite ready for prime time yet, but if you join the Components email list, you can follow progress.
Where can I get these Platforms and Components ?
They’re all on Github, as is all our code (including a lot that we haven’t made easy to re-install yet). As it’s open source code, you can take them for free.
If you want to use MapIt, or learn about our future components, please sign up for the Components mailing list at the same time – it can be an invaluable place to get support when you have questions. You can also improve the code – sharing your improvements with us is a great way to say thank you. Plus, if you have ideas for other Components that will work well with ours, we’d love to hear about them.
We don’t just build this stuff, we also help people install and run it. Keep in touch and let us know how you’re using our code, and what is or is not working. If you hit any problems, there is always someone who can help.
FixMyStreet.com is mySociety’s popular British site for reporting problems like broken street lights and holes in the road. It works because as well as recording reports online, it sends copies to the relevent local governments. It has inspired many ‘grandchildren’ around the world.
Today marks the start of a new era for FixMyStreet as we push out the start of a major design upgrade in Britain, aimed particularly at making the mobile web experience as good as the desktop web experience.
Simultaneously, we’re also launching a guide to using the FixMyStreet Platform as the basis for your website in other countries.
- We’ve set up a new homepage for the FixMyStreet Platform.
- We’ve set up a new mailing list which you can join if you want to talk with us and with other users.
- We’ve published a brand new guide, suitable for technical and non-technical readers, about how and why you should consider using the FixMyStreet Platform to build your FixMyStreet-style website
We’re also here, waiting and ready to give you a hand. So if you’ve ever thought about setting up FixMyStreet outside Britain, there’s no better time to start than today.