The Groups.io Engineering Philosophy

A thundering herd of wildebeest, Serengeti, Tanzania

But what a fool believes he sees
No wise man has the power to reason away
What seems to be
Is always better than nothing
And nothing at all keeps sending him…
What A Fool Believes, Doobie Brothers

Ask my kids and they will tell you, I am the party of no. No dessert before you eat your vegetables. No skipping out on 3rd grade to day trade stonks. No feeding your gremlin after midnight. I apply this principle to Groups.io devops as well.

One thing to note before we begin is that Groups.io is a small company with steady, predictable growth. For reference, we have about 60 Linodes of various sizes in our production cluster. Traffic is on the order of tens of millions of page views and emails a day. Some of the following is not applicable or appropriate for fast-growing or large services (but I think it is appropriate for most services).

KISS Is Not Just A Band Although KISS Is Also A Band

My guiding principle is:

Eschew recrementitious convolution

That means I have a thesaurus. It also means: keep things as simple as possible. The fewer technologies we use, the fewer things there are to go wrong.

When I’m getting paged at 3am because something is wrong with the site, I am not going to be at my sharpest and I’m going to be under a lot of pressure to get things fixed quickly. The simpler the system is, the easier it is going to be for me to figure out what’s wrong, and to fix it.

For us, this policy manifests in several choices for our tech stack. And that starts with the programming language.

Compile The World And Copy It Everywhere

Everything is written in Go. I am on record as being a huge Go fanboy. I will not extoll all the virtues of Go in this post, but I want to point out three important aspects of the language. First, it is not complex. There is no magic. This is important when trying to figure out what a piece of code you wrote in the last decade is doing. The second important aspect is that it is a compiled language. I believe in compiling all the things. The third important aspect is that it produces binaries that are (mostly) statically linked. Which means that it’s super easy to just copy them around as needed, without needing any support scaffolding.

In Which Our Villain Says No To A Bunch Of Useful Tech

Groups.io does not use containers. And that means no Kubernetes. The way we do releases is that we create a tarball of the new executables, copy it to all the machines, untar it into a new directory, and then move symlinks to point at the new binaries. After that is completed, we restart the services using the new executables. Our production instances typically have uptimes in the hundreds of days.

We don’t do any autoscaling. I realize that means that we’re not fully utilizing our paid-for production capacity at all times, and I’m ok with that. I’ll take that tradeoff, for less complexity. We have a utility that uses the Linode API to spin up new machine instances and configure them when we need them. But because we have predictable growth, we have plenty of warning and can add machines at our leisure.

We don’t use DNS to access our machines internally. We use an SSH config file with ProxyCommands for each machine. Again, this is to keep things simple.
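For anyone who hasn't used ProxyCommand before, an entry looks something like this (the host names and the bastion host here are made up for illustration, not our actual machines):

Host web01
    HostName 192.0.2.10
    User deploy
    # Tunnel through a bastion host instead of relying on internal DNS.
    ProxyCommand ssh -W %h:%p bastion.example.com

With a stanza like that per machine, "ssh web01" just works, and there's no internal DNS to maintain or debug.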

We have a monorepo, hosted by GitHub. We do not do automatic deployments on commit. We deploy all the time (including Friday afternoons!); it’s just that the idea of automatic deployments makes me twitchy.

Structured Logs And JSON Configuration Files

We keep all our log files on the machines they were produced on. The way that Linode prices instances, we end up having a lot of extra, unused disk space on each instance, so we’re able to keep copious logs (see also above where we don’t autoscale). We have a program, unimaginatively called research, that will grep through appropriate log files on each machine, when we need to debug or trace an event. This works really well for us.
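The idea behind research is simple enough to sketch. A stripped-down, hypothetical version, fanning an ssh + grep out to each machine (the host names and log paths here are invented, and this is not the actual tool), might look like this:

package main

import (
    "fmt"
    "os"
    "os/exec"
    "sync"
)

func main() {
    if len(os.Args) < 2 {
        fmt.Fprintln(os.Stderr, "usage: research <pattern>")
        os.Exit(1)
    }
    pattern := os.Args[1]
    hosts := []string{"web01", "web02", "smtp01"} // hypothetical host aliases from the SSH config

    var wg sync.WaitGroup
    for _, host := range hosts {
        wg.Add(1)
        go func(host string) {
            defer wg.Done()
            // The remote shell expands the glob; grep exits non-zero when nothing
            // matches, so we ignore the error and just check for output.
            out, _ := exec.Command("ssh", host, "grep", "-h", pattern, "/var/log/app/*.log").Output()
            if len(out) > 0 {
                fmt.Printf("=== %s ===\n%s", host, out)
            }
        }(host)
    }
    wg.Wait()
}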

Configuration data is kept in JSON files, and is copied to each machine. We do not have a centralized configuration database.
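To give a flavor of what that looks like, here is a sketch of the pattern (the field names are invented for illustration, not our actual config). Each service just parses a local JSON file at startup:

package config

import (
    "encoding/json"
    "os"
)

// Config is a sketch of a service configuration loaded from a local JSON file.
// The fields are hypothetical; the point is that it's a flat file copied to
// each machine, with no central configuration service involved.
type Config struct {
    DatabaseURL string `json:"database_url"`
    NSQDAddr    string `json:"nsqd_addr"`
    SMTPHost    string `json:"smtp_host"`
}

// Load reads and parses the config file at path.
func Load(path string) (*Config, error) {
    data, err := os.ReadFile(path)
    if err != nil {
        return nil, err
    }
    var c Config
    if err := json.Unmarshal(data, &c); err != nil {
        return nil, err
    }
    return &c, nil
}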

Distributed Messaging Is Great

I don’t always say no. One core piece of our tech stack is NSQ, a distributed messaging system. It has been rock solid for us. Our large but not-exactly-a-monolith web server will off-load tasks that can be done in the background or that take a long time, to various microservices via NSQ. Some of these include uploading files to Amazon S3, generating group or user exports, sending notifications and webhooks, and sending emails.
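As a concrete sketch of that hand-off (the topic name and payload are hypothetical), publishing and consuming a background job with the go-nsq client looks something like this:

package main

import (
    "encoding/json"
    "log"

    nsq "github.com/nsqio/go-nsq"
)

type exportJob struct {
    GroupID int64 `json:"group_id"`
}

func main() {
    cfg := nsq.NewConfig()

    // Producer side (in the web server): publish and move on.
    producer, err := nsq.NewProducer("127.0.0.1:4150", cfg)
    if err != nil {
        log.Fatal(err)
    }
    body, _ := json.Marshal(exportJob{GroupID: 42})
    if err := producer.Publish("group_exports", body); err != nil {
        log.Fatal(err)
    }

    // Consumer side (in the export microservice): handle jobs as they arrive.
    consumer, err := nsq.NewConsumer("group_exports", "exporter", cfg)
    if err != nil {
        log.Fatal(err)
    }
    consumer.AddHandler(nsq.HandlerFunc(func(m *nsq.Message) error {
        var job exportJob
        if err := json.Unmarshal(m.Body, &job); err != nil {
            return err // returning an error requeues the message
        }
        log.Printf("generating export for group %d", job.GroupID)
        return nil
    }))
    if err := consumer.ConnectToNSQD("127.0.0.1:4150"); err != nil {
        log.Fatal(err)
    }
    select {} // block forever; a real worker would handle shutdown signals
}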

One thing I wish NSQ had was a request-response system. When interacting with some of our microservices, we need a response. For those services, we currently use HTTP JSON RPC calls. It would be interesting if we could instead use NSQ in a way that lets us receive a response (currently NSQ is basically fire-and-forget). That way, we could take advantage of NSQ’s distributed nature, which would make managing those microservices easier. I believe NATS may have something like this, although I haven’t investigated it.
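For comparison, here is roughly what request-reply looks like with the nats.go client. This is just a sketch of the pattern, not something we run, and the subject name is made up:

package main

import (
    "log"
    "time"

    "github.com/nats-io/nats.go"
)

func main() {
    nc, err := nats.Connect(nats.DefaultURL)
    if err != nil {
        log.Fatal(err)
    }
    defer nc.Close()

    // The microservice side: answer requests on a subject.
    if _, err := nc.Subscribe("thumbnail.generate", func(m *nats.Msg) {
        m.Respond([]byte("done"))
    }); err != nil {
        log.Fatal(err)
    }

    // The caller side: publish a request and block until a reply (or timeout).
    reply, err := nc.Request("thumbnail.generate", []byte("photo-123"), 2*time.Second)
    if err != nil {
        log.Fatal(err)
    }
    log.Printf("reply: %s", reply.Data)
}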


Groups.io is the best tool to get a bunch of people organized and sharing knowledge. Start a free trial group today.

Why We’re Not Leaving The Cloud

Mars Camp, Wadi Rum, Jordan

You’re A Rich Girl, And You’ve Gone Too Far
‘Cause You Know It Don’t Matter Anyway
You Can Rely On The Old Man’s Money
You Can Rely On The Old Man’s Money
Rich Girl, Hall & Oates

David Heinemeier Hansson recently posted about Basecamp’s decision to leave Amazon AWS and move to self-hosting their own servers, Why We’re Leaving The Cloud. It’s a good post and you should check it out. Basecamp is spending over half a million dollars a year just on database and search hosting.

Renting computers is (mostly) a bad deal for medium-sized companies like ours with stable growth. The savings promised in reduced complexity never materialized. So we’re making our plans to leave.

David Heinemeier Hansson

David frames the question of hosting as a choice between two extremes. On the one hand, you buy and run your own servers. On the other hand, you use a full-service cloud provider, like AWS, including their managed services, like RDS. But that’s a false dichotomy. There are plenty of options in the middle, both in terms of providers and in terms of how you utilize them. By taking this middle path, you can get many of the advantages of the cloud without incurring the added expense.

Not Just The Big Guys

When people think of the cloud, Amazon AWS immediately comes to mind. There’s also Microsoft’s Azure and Google Cloud. But there are other options as well, such as Rackspace, Digital Ocean, and Linode. At Groups.io, we have used Linode for over 8 years to great success. The reason that we use Linode instead of AWS is price. They cost less than Amazon or Microsoft or Google.

Typically these companies don’t offer as many services as the big guys. Linode is just now starting to offer a managed database service, for example. But that’s fine for us because we run our own database. And that brings up my second point. By managing our own database, we again save money. Basecamp is currently using Amazon’s expensive managed database offering. When they switch to their own servers, they will need to manage their own database anyway. So there’s no advantage to moving to their own servers over how we’re using the cloud.

But even with managed services, we come out ahead with Linode. We currently use Amazon S3 for a lot of our storage. But Linode recently started offering a compatible service that will allow us to cut our storage bill in half. So we will be moving to that soon.

Sure, But Renting Still Costs More Over Time

Yes, regardless of where you host, renting machines will cost more over time vs owning and hosting your own machines. It’s worth it for us, because even though, like Basecamp, we are a company with stable growth, it’s still important for us to be able to bring up new machines quickly and/or for short periods of time. Let me give you a recent example.

We use Postgres for our database, in a three machine cluster configuration. We needed to upgrade from the older version we were using to the latest version. For us, the best way to do that upgrade was to bring up three new database machines, copy the database over, test it, switch the site to the new cluster, and then retire the old cluster. With Linode, we were able to bring up three beefy new machines quickly and just as quickly retire the three old machines. If we were self hosting, we would have to purchase those new machines, at considerable expense. We would have ended up with three large machines sitting idle after the upgrade.

The Pain Of Owning Your Own Servers

I’ve had to run to the datacenter at 3AM to fix a broken machine. I’ve had to deal with datacenter (lack of) cooling issues causing hard drives to prematurely fail (thankfully SSDs moot this). I’ve had to deal with flaky server BIOSes. I’ve had to deal with running out of rack space in a datacenter that’s full. I’ve had to deal with negotiating co-location and bandwidth contracts. I am more than happy to pay more to never have to deal with any of that again. I do not consider hardware to be one of our core competencies, nor do I want it to be.

Decentralize All The Things

One final point that David makes is that many services are hosted on Amazon, and that is not a good thing:

It strikes me as downright tragic that this decentralized wonder of the world is now largely operating on computers owned by a handful of mega corporations. If one of the primary AWS regions go down, seemingly half the internet is offline along with it. This is not what DARPA designed!

I completely agree with David on this. And I like to think that we’re doing our part.



Re: Introduction

Puppy by Jeff Koons, Bilbao, Spain

I’ve been one poor correspondent
And I’ve been too, too hard to find
But it doesn’t mean you ain’t been on my mind
Sister Golden Hair, America (the band)

After a long dormant period, I am reactivating the blog. I figure this is a good time to re-introduce myself, and because I generally find these types of posts boring and self-congratulatory, I will borrow a concept from Scalzi, the “fake interview.” Let’s begin.

So, dude, what’s with the domain name?

That’s how you want to start this? Ok, fine. A long time ago, I had the nickname of ‘pig’, because I put a picture of a pig on my resume.

That was dumb.

I mean, yeah, sure, now that you mention it and also, looking back on it 25 years later. In my defense, I did get the job, and made a bunch of friends.

I’m not liking your tone. Can we get back to the intro now?

Whatever, pig boy. Ok, besides that bad decision, who are you?

For the purposes of this blog, I’m a software engineer (I’m also a dad to 9 year old twins and husband to Suzanne). I’ve started three companies. I started ONElist, an email groups service, in 1997, and it was acquired by Yahoo in 2000 (where it was renamed Yahoo Groups). I started Bloglines, an online RSS aggregator before Google Reader was a thing, in 2003, and it was acquired by Ask Jeeves in 2005. I started Groups.io, an email groups service, in 2014, and I still run it. Also, you’re really mean.

Wait, you did the email groups service thing like 25 years ago, and you’re doing it again?

I am.

Out of ideas, eh?

Nope. I still think it’s a good idea and is something that should exist. Fortunately, many people agree with me. Also, it just passed its 8th anniversary. I believe it is a net positive in the world, and that’s important to me.

So, gonna take lots of VC money, hire a ton of people, grow the service and then pawn it off on an acquirer where it’ll then languish and die? I mean, that does seem to be your look.

Again, nope. I did that with ONElist, ’cause that was the style of the time. Bloglines never got to the point where I was ready to take funding before Ask came in with an acquisition offer. I am keeping Groups.io lean and focused, and I have no plans to take outside funding or to sell it.

You know, I didn’t think this interview would be so adversarial.

Hey, I’m trying my best to keep this from being boring, but you’re not giving me much to work with here.

Fair. How about we wrap this up?

So, what will you be talking about on this here flying pig blog?

I’ll be focusing on two things: the business aspects of running a non-VC funded Internet business from the viewpoint of a tech founder, and the technical aspects of running a site like Groups.io. Expect a post about every two to four weeks.

Don’t kill yourself with the posting frequency there, Mr Post Master. Why are you doing this?

The same reason anyone does a blog like this, marketing, of course. And while I’m here, please check out Groups.io for all your email groups needs. Email groups are a great way for groups of people large or small to stay in touch. We have a complete collaboration suite, including: group calendar, files, photos, and wiki.

Corporate shill. You probably didn’t even write this. I’m afraid to ask; what’s with the America song lyric at the top?

What can I say, I’m a fan of the smooth sounds of yacht rock. And yes, I wrote this.

Yacht Rock? Seriously?!? That’s it, I’m out.



Notes on converting a repository to use Go Modules

Updated 5/16/19: Added a note to the Incremental Conversion section about using specific revs of a repository instead of replace directives.

Go modules is the new dependency management system for Go projects. I just spent the last three days converting the Groups.io codebase to use Go modules. It’s a large code base, and we had vendored ~75 dependencies over the past five years of development. The code base pre-dates the earlier dependency management systems, so we were flying a bit blind when it came to which version of each dependency we were using. There are several more comprehensive guides to converting to Go modules; this is just a set of my notes. I may update them as I get more experience with modules.

Incremental Conversion

I wanted to do the conversion incrementally, to give me more opportunity to test things and make sure I wasn’t breaking the build. What I did was create the empty go.mod file, and then within it, I added replace directives for each dependency we had vendored. The replace directive pointed into the vendor directory, like this:

github.com/jackc/pgx => ./vendor/github.com/jackc/pgx

Once I had done that, I started running go build. That would complain about missing go.mod files in each of the dependent directories. I would dutifully go in and create these missing go.mod files one by one. I continued doing this until there were no more errors.
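Each of those stub go.mod files was tiny, just a module line mirroring the package’s import path. Using pgx as an example, the stub looked something like:

module github.com/jackc/pgx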

NOTE: It’s been pointed out, here, that instead of adding a bunch of replace directives, I could have required the specific revision of each repository and Go modules would have handled it correctly. I didn’t know about this capability when I did the conversion.

Dependency Review

Now that I had a working build, I started the review process. I went through each dependency and, by noting when we had vendored it, reviewed the upstream changes since then. For dependencies with no changes I was uncomfortable with, I removed the replace directive from the go.mod file, letting Go modules manage that dependency.

Local Changes And Deleted Repos

We have made changes to several of these dependencies (changes that are not appropriate for upstream pull requests), and we need to keep those changes. Also, a few dependencies that we had vendored no longer exist, because their creators have deleted the repositories. I created a new top-level directory, internal. I then moved those dependencies from vendor into internal, and updated the relevant replace directives to point into internal.

Keeping A /vendor Directory

After all that work, we had a go.mod file with a large number of require directives, and about 14 replace directives. I removed the existing vendor directory from the repository. A downside quickly became apparent, however. Within Sublime Text, like most editors, I can hover over a type or function call and Sublime will list the various files where it may have been defined. This is very helpful during development. But without a vendor directory, Sublime couldn’t locate any of the module-managed dependencies. It can be handy to have all the source code available to the editor when doing development. To generate a /vendor directory from your go.mod file, use the command go mod vendor. To then use the /vendor directory for builds, add the parameter -mod=vendor when building.

Another advantage of keeping a /vendor directory is if you run the godoc tool locally. If you include the query parameter ?m=all when accessing the godoc webpage, it will include the /vendor directory (as well as the /internal directory, if you have one) when generating documentation.

go install gotcha

With go modules, the GOPATH environment variable is no longer needed. If you run go install without a GOPATH, it apparently does nothing, as it doesn’t know where to put the generated binary. To get go install to work again, I had to set the GOBIN environment variable.

Streaming Postgres Changes

Groups.io is based on a series of Postgres databases, which has worked very well for us. There are some scenarios, however, where it would be advantageous to have our data in a streaming log-based system, like Apache Kafka. One specific scenario involves ElasticSearch, which we use to provide full text search over group archives. Right now, when a message is added or updated, we send the update to Postgres, then we send the update to the ElasticSearch cluster. If we want to re-index our archives, we have to prevent new messages from coming in as well as changes to existing messages, while we do a table scan of the archives into a new ES cluster. This is non-optimal.

A better pattern would be as follows:

  • Additions/Changes/Deletions from Postgres get streamed into a log-based system.
  • A reader is constantly consuming those changes and updating the ES cluster (a sketch of such a reader appears after this list).
  • When we want to re-index the site, we start a new reader which consumes the log from the beginning, creating a new ES index.
  • When the new ES index is up to date, simply switch the site over to it, stop the old reader and delete the old ES index.
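To make the reader step concrete, here is one way it might look, using the segmentio/kafka-go client and Elasticsearch’s HTTP API. The topic, index, and message layout are all hypothetical; this is the shape of the idea, not production code:

package main

import (
    "bytes"
    "context"
    "fmt"
    "log"
    "net/http"

    kafka "github.com/segmentio/kafka-go"
)

func main() {
    r := kafka.NewReader(kafka.ReaderConfig{
        Brokers: []string{"127.0.0.1:9092"},
        Topic:   "archive-changes", // hypothetical topic of row changes
        GroupID: "es-indexer",
    })
    defer r.Close()

    for {
        m, err := r.ReadMessage(context.Background())
        if err != nil {
            log.Fatal(err)
        }
        // Assume the message value is already an Elasticsearch-ready JSON doc
        // and the key is the message ID.
        url := fmt.Sprintf("http://127.0.0.1:9200/messages/_doc/%s", m.Key)
        req, _ := http.NewRequest(http.MethodPut, url, bytes.NewReader(m.Value))
        req.Header.Set("Content-Type", "application/json")
        resp, err := http.DefaultClient.Do(req)
        if err != nil {
            log.Printf("index error: %v", err)
            continue
        }
        resp.Body.Close()
    }
}

Re-indexing then just means starting a second copy of this reader, pointed at a fresh index, with its consumer group reset to the beginning of the log.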

There are other scenarios where having a log-based representation of the data would be useful as well. With this in mind, I’ve been researching ways to stream Postgres changes. These are my notes about what I’ve learned so far. They may be incomplete and contain errors. Corrections are appreciated.

Postgres introduced logical decoding in version 9.4. With the addition of a plugin, it is now possible to stream changes from a Postgres database in whatever format you prefer. There are a couple of projects that use this to stream Postgres into Kafka, like Bottled Water (no longer maintained) and Debezium. I could not get the Debezium Postgres plugin compiled on CentOS 7. In addition, there’s a competitor to Kafka, called NATS, which, while not as mature as Kafka, has the advantage (to me) of being written in Go. There appears to be a connector between Postgres and NATS, but I haven’t explored it. Another related project is pg_warp, which allows you to stream Postgres changes to another Postgres database.

I wanted to explore exactly how a Postgres streaming system would work. While everything is documented, it was not clear to me how the process worked, should one want to implement their own logical replication system. The Postgres docs helped my understanding, along with this presentation, and this blog post from Simple. Also, playing with the wal2json plugin and following their README helped, and Debezium’s docs go into some detail as well. But there is more to it. This is what I’ve found out.

There are a couple of parts to a Postgres streaming system. You first want to get a complete snapshot of the existing database, and then you want to get all changes to the database going forward, without missing any changes should there be a hiccup (i.e. a crash).

Here are the steps required (note: this may be updated as I gain more experience with this):

  • Set up Postgres for logical replication, and decide which plugin to use. All the plugin does is determine the format of the data you will receive.
  • Connect to the database using the streaming protocol. This means appending “replication=database” to the URL. The streaming protocol is not the same as the normal Postgres protocol, although you can use psql to send some commands. If you are programming in Go, the only Postgres driver that I found that supports the replication protocol is pgx. Unfortunately, one of the commands needed from it, CreateReplicationSlot(), does not return the name of the Snapshot created, which you need. I’ve submitted a pull-request with the change to return this information.
  • At the same time, connect to the database the normal way.
  • On the streaming connection, issue this command (using wal2json as the plugin for this example):
    CREATE_REPLICATION_SLOT test_slot LOGICAL wal2json;
  • This creates the replication slot, and it also generates a snapshot. The name of the snapshot is returned. Also, the consistent_point is returned.
  • On the normal connection, you can now take a snapshot of the existing database using the snapshot name returned above (see the Go sketch after this list). Use these commands to initiate the transaction:
    BEGIN;
    SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
    SET TRANSACTION SNAPSHOT snapshot_name;
    SELECT * from ….;
    COMMIT;
  • Once the snapshot has been completed, on the streaming connection, issue:
    START_REPLICATION SLOT test_slot LOGICAL consistent_point;
  • That starts the streaming. You will receive data in the format the plugin outputs. One piece of data also returned is the WAL position. You can use this to resume streaming, should you have to restart your streaming system.
  • When you are done streaming, you must issue a DROP_REPLICATION_SLOT command on the streaming connection; otherwise Postgres will not be able to remove old WAL files and you will eventually run out of disk space.
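To make the snapshot step concrete, here is a Go sketch of the normal-connection side, given the snapshot name returned when the slot was created. The table and column names are placeholders, and it uses pgx’s database/sql adapter:

package main

import (
    "database/sql"
    "fmt"
    "log"

    _ "github.com/jackc/pgx/stdlib" // registers the "pgx" database/sql driver
)

// snapshotScan reads the existing data at exactly the point the replication
// slot was created, by importing the slot's snapshot into a repeatable-read
// transaction. snapshotName comes from the CREATE_REPLICATION_SLOT response
// on the streaming connection.
func snapshotScan(db *sql.DB, snapshotName string) error {
    tx, err := db.Begin()
    if err != nil {
        return err
    }
    defer tx.Rollback()

    if _, err := tx.Exec("SET TRANSACTION ISOLATION LEVEL REPEATABLE READ"); err != nil {
        return err
    }
    // SET TRANSACTION SNAPSHOT does not accept bind parameters, so the name is
    // interpolated directly; it comes from Postgres itself, not user input.
    if _, err := tx.Exec(fmt.Sprintf("SET TRANSACTION SNAPSHOT '%s'", snapshotName)); err != nil {
        return err
    }

    rows, err := tx.Query("SELECT id, body FROM messages") // placeholder table
    if err != nil {
        return err
    }
    defer rows.Close()
    for rows.Next() {
        var id int64
        var body string
        if err := rows.Scan(&id, &body); err != nil {
            return err
        }
        // ... feed the row into the downstream system ...
    }
    if err := rows.Err(); err != nil {
        return err
    }
    return tx.Commit()
}

func main() {
    db, err := sql.Open("pgx", "postgres://localhost/archive?sslmode=disable")
    if err != nil {
        log.Fatal(err)
    }
    // Example snapshot name; the real one comes back from the slot creation.
    if err := snapshotScan(db, "00000003-00000002-1"); err != nil {
        log.Fatal(err)
    }
}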

I am not completely sure that I’ve outlined the process correctly for when you have to restart streaming. I am also unclear about replication origins, but I think that’s only applicable if you’re replicating into another Postgres database. But I’m not sure.

Sony A7R II Eye AF Settings

I recently got a Sony A7R II camera, to replace my Panasonic GH2. I also got the 55mm prime and the 24-70mm zoom lenses. It’s a great camera, and I’m still learning the ins and outs of it. One really nice feature of the camera is Eye AF, the ability to focus automatically on someone’s eyes. It works surprisingly well, and is a great feature especially if you have young children (who aren’t so keen on staying in one spot for very long). You can tell when the camera has locked onto someone’s eyes because it’ll draw little green boxes around the eyes.

Eye AF is great, but the way it’s implemented is a little quirky. With the default configuration of the camera, you must first half-press the shutter button to focus the camera (as you would normally), then you have to press and hold the center button of the control wheel to activate Eye AF (while still half-pressing the shutter button). This is awkward, to say the least. I have reconfigured my camera to make things easier. I now use back-button focus, tied to the AEL button. Also, I have configured the AF/MF/AEL switch lever to toggle between normal AF and Eye AF. So now, to focus the camera, I hold down the AEL button, and then use the shutter to take the photo. Depending on which position the AF/MF/AEL lever is in, when focusing I’ll either be in normal AF or Eye AF. I don’t have to hold down two buttons at once to focus, and I can quickly switch autofocus modes.

It took me a bit to figure out how to configure the camera to do this, so here are the steps required (2.5 means go to the second tab in the Menu, 5th screen):

  • 2.5 AF w/shutter – Off
  • 2.6 AEL w/shutter – Auto
  • 2.7 Custom Key Settings
    • AEL Button – Eye AF
    • AF/MF Button – AF On

It’s also important to note that for Eye AF to work, the camera must be in AF-C (continuous autofocus) mode.

Also, here’s the Sony Help Guide for the A7RII.

Backup Strategy 2015

I just changed large parts of our family backup strategy, and looking back, it’s been 2 years since I last detailed what we use, so I thought it’d be a good time to revisit the topic. In our family, for computers, we have several Macs. For data, we have about 20 GB of personal and financial documents, and a little more than 200 GB of photos. I believe in having at least two backups, one of which must be offsite/in the cloud.

Previously, we used Dropbox and Boxcryptor to share our personal files, and the photos resided on a Synology DS412+ NAS. I was never comfortable having our personal information on Dropbox, even using Boxcryptor, which had the side effect of making things more cumbersome. Synology has a private Dropbox feature, called Cloud Station, and we’ve moved everything off of Dropbox onto it. It’s been problem free.

For backups, we continue to use Time Machine to back up our Macs to a Time Capsule, and Crashplan to back up our Macs to the cloud. This works fine. Previously, I had also used Crashplan on the Synology to backup our photos to the cloud and also to an external USB drive. This never worked well. Crashplan is not officially supported on Synology, so anyone wishing to use it has to rely on a third-party package. Every time Synology updated the operating system (which is fairly often), Crashplan would break. It would also break at other, random times. Finally I couldn’t get it to work anymore at all. As an aside, as part of my trouble-shooting, I learned that if there are problems, the Synology will mount the USB drive as read-only, but not make it apparent that it has done so. This backup system was just not working.

So, I threw out Crashplan and the external USB drive on the Synology, and replaced it with two things. First, I started using Synology’s package to do backups to Amazon Glacier, their cloud archiving service. It took about 5 days to backup 230 GB of data, and cost a little under $10. If I understand the billing correctly, it will continue to cost about $10/month to store that data, which I agree is somewhat expensive. And should I need to recover it, it will cost a lot more. But I consider the Glacier backup a disaster recovery backup only, and don’t anticipate ever having to recover from it. The Glacier backup is scheduled to run once a week.

The other thing I did was purchase a second Synology NAS (a DS415+, the next hardware rev of the DS412+), and set up nightly backups from the first Synology. I believe it’s an rsync-based system, and it provides multiple versions of files. It was painless to set up, and because it’s a local backup, it took a bit less than 2 hours to do a full backup.

So now, I have our data on 4 hard drives locally (each Synology has 2 drives in a RAID), as well as in the cloud. Additionally, our Macs are backed up to two different places, one of which is in the cloud.

Groups.io Update

It’s been almost six months since I launched Groups.io and four months since I’ve talked about it here on the blog, so I figured it was time for an update. I’ve been heads down working on new features and bug fixes. Here’s a short list of the major features added during that time:

Slack Member Sync

Mailing lists and chat, like peanut butter and chocolate, go great together. Do you have a Slack Team? You can now link it with your Groups.io group. Our new Slack Member Sync feature lets you synchronize your Slack and Groups.io member lists. When someone joins your Groups.io group, they will automatically get an invite to join your Slack Team. And when someone joins your Slack Team, they’ll automatically get added to your Groups.io group. You can configure the sync to be automatic or you can sync members by hand. Access the new member sync area from the Settings page for your group.

As an aside, another potentially great combination, bacon and chocolate, do not go great together. Trust us, we’ve tried.

Google Log-in

You can now log into Groups.io using Google. For new users, this allows them to skip the confirmation email step, making it quicker and easier to join your groups.

Markdown and Syntax Highlighting Support

You can now post messages using Markdown and emoji characters. And we support syntax highlighting of code snippets.

Archive Management Tools

The heart of a group is the message archive. And nobody likes an unorganized archive. We’ve added the ability to split and merge threads. Has a thread changed topics halfway through? Split it into two threads. Or if two threads are talking about the same thing, you can merge them. You can also delete individual messages, and change the subject of threads.

Subgroups

Groups.io now supports subgroups. A subgroup is a group within another group. When viewing your group on the website, you can create a subgroup by clicking the ‘Subgroup’ tab on the left side. The email address of a subgroup is of the form parentgroup+subgroup@groups.io.

Subgroups have all the functionality of normal groups, with one exception. To be a member of a subgroup, you must be a member of the parent group. A subgroup can be open to all members of the parent group, or it can be restricted. Archives can be viewable by members of the parent group, or they can be private to the members of the subgroup. Subgroups are listed on the group home page, or they can be completely hidden.

Calendar, Files and Wiki

Every group now has a dedicated full-featured Calendar, Files section, and Wiki.

In other news, we also started an Easy Group Transfer program, for people who wish to move their groups from Yahoo or Google over to Groups.io.

Email groups are all about community, and I’m pleased that the Beta group has developed into a valuable community, helping define new features and scope out bugs. I’m working to be as transparent as possible about the development of Groups.io through that group, and through a dedicated Trello board which catalogs requested features and bug reports. If you’re interested, please join and help shape the future of Groups.io!

Groups.io Database Design

Continuing to talk about the design of Groups.io, today I’ll talk about our database design.

Database Design

Groups.io is built on top of PostgreSQL. We use GORP to handle marshaling our database objects. We split our data over several separate databases. The databases are all currently running in one PostgreSQL instance, but this will allow us to easily split the data over several physical databases as we scale up. A downside is that we end up having to manage more database connections now, and the code is more complicated, but we won’t have to change any code in the future when we split the databases over multiple machines (sharding is a whole other thing).
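For flavor, wiring one of those databases up with gorp looks roughly like this. The struct, table, and driver choice here are illustrative, not our actual schema or code:

package userdb

import (
    "database/sql"

    "github.com/go-gorp/gorp"
    _ "github.com/jackc/pgx/stdlib" // registers the "pgx" database/sql driver
)

// User is an illustrative struct; the real tables and fields differ.
type User struct {
    Id    int64
    Email string
}

// Open connects to one of the databases and registers its tables with gorp.
func Open(dsn string) (*gorp.DbMap, error) {
    db, err := sql.Open("pgx", dsn)
    if err != nil {
        return nil, err
    }
    dbmap := &gorp.DbMap{Db: db, Dialect: gorp.PostgresDialect{}}
    // gorp handles marshaling rows into and out of the struct; the primary
    // key is a plain auto-incremented 64-bit integer.
    dbmap.AddTableWithName(User{}, "users").SetKeys(true, "Id")
    return dbmap, nil
}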

There are no joins in the Groups.io system and there are no foreign key constraints. We enforce constraints in an application layer. We did this for future scalability. It did require more work in the beginning and it remains to be seen if we engaged in an act of premature optimization. Every record in every table has a 64-bit integer primary key.

We have three database machines. DB01 is our main database machine. DB02 is a warm standby, and DB03 is a hot standby. We use WAL-E to back up DB01’s database to S3. DB02 uses WAL-E to pull its data from S3 to keep warm. All three machines also run Elasticsearch as part of a cluster. We run statistics on DB03.

Our data is segmented into the following main databases: userdb, archivedb, activitydb, deliverydb, integrationdb.

Userdb

The userdb contains user, group and subscription records. Subscriptions provide a mapping from users to groups, and we copy down several bits of information from users and groups into the subscription records, to make some processing easier. Here are some of the copied down columns:

type Subscription struct {
    // ... other fields elided ...
    GroupName  string // Group.Name
    Email      string // User.Email
    UserName   string // User.UserName
    FullName   string // User.FullName
    UserStatus uint8  // User.Status
    Privacy    uint8  // Group.Privacy
}

We maintain these columns in an application layer above the database. By duplicating this information in the subscription record, we greatly reduce the number of user and group record fetches we need to do throughout the system. These fields rarely change, so there’s not a large write penalty. There is definitely a memory penalty, with the expanded subscription record. But I figured that was a good trade off.

Archivedb

The archivedb stores everything related to message archives. The main tables are the thread table and the message table. We store every message in the message table, as raw compressed text, but before we insert each message, we strip out any attachments, and instead store them in Amazon’s S3. This reduces the average size of emails to a much more manageable level.
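Here is a sketch of that flow using the AWS SDK for Go. The bucket name, key scheme, and the attachment-splitting helper are all hypothetical stand-ins, not our actual code:

package archive

import (
    "bytes"
    "compress/gzip"
    "fmt"

    "github.com/aws/aws-sdk-go/aws"
    "github.com/aws/aws-sdk-go/aws/session"
    "github.com/aws/aws-sdk-go/service/s3/s3manager"
)

// splitAttachments is assumed to return the message with attachments removed,
// plus the attachment bodies; its implementation is elided here.
func splitAttachments(raw []byte) (stripped []byte, attachments [][]byte) { return raw, nil }

// storeMessage uploads attachments to S3 and returns the stripped, compressed
// message body, ready to be inserted into the message table.
func storeMessage(msgID int64, raw []byte) ([]byte, error) {
    stripped, attachments := splitAttachments(raw)

    // Upload each attachment to S3 rather than bloating the message table.
    uploader := s3manager.NewUploader(session.Must(session.NewSession()))
    for i, att := range attachments {
        _, err := uploader.Upload(&s3manager.UploadInput{
            Bucket: aws.String("example-attachments"),
            Key:    aws.String(fmt.Sprintf("messages/%d/%d", msgID, i)),
            Body:   bytes.NewReader(att),
        })
        if err != nil {
            return nil, err
        }
    }

    // Compress what's left before it goes into the message table.
    var buf bytes.Buffer
    zw := gzip.NewWriter(&buf)
    if _, err := zw.Write(stripped); err != nil {
        return nil, err
    }
    if err := zw.Close(); err != nil {
        return nil, err
    }
    return buf.Bytes(), nil
}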

Activitydb

The activitydb stores activity logging records for each group.

Deliverydb

The deliverydb stores bounce information for users.

Integrationdb

The integrationdb stores information relating to the various integrations available in Groups.io.

Search

We use Elasticsearch for our search, and our indexes mirror the PostgreSQL tables. We have a Group index, a Thread index and a Message index. I tried a couple of Go Elasticsearch libraries and didn’t like any of them, so I wrote my own simple library to talk to our cluster.
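A thin client like that amounts to little more than hand-rolled HTTP calls against the Elasticsearch REST API. Here is a hypothetical sketch of a search call in that spirit (the index name and query shape are placeholders, not our actual library):

package search

import (
    "bytes"
    "encoding/json"
    "fmt"
    "net/http"
)

// Search runs a simple match query against an index and returns the decoded
// response body as a generic map.
func Search(esURL, index, field, text string) (map[string]interface{}, error) {
    query := map[string]interface{}{
        "query": map[string]interface{}{
            "match": map[string]interface{}{field: text},
        },
    }
    body, err := json.Marshal(query)
    if err != nil {
        return nil, err
    }
    resp, err := http.Post(fmt.Sprintf("%s/%s/_search", esURL, index), "application/json", bytes.NewReader(body))
    if err != nil {
        return nil, err
    }
    defer resp.Body.Close()
    if resp.StatusCode != http.StatusOK {
        return nil, fmt.Errorf("elasticsearch returned %s", resp.Status)
    }
    var result map[string]interface{}
    if err := json.NewDecoder(resp.Body).Decode(&result); err != nil {
        return nil, err
    }
    return result, nil
}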

Next Time

In future articles, I’ll talk about some aspects of the code itself. Are there any specific topics you’d like me to address? Please let me know.

Are you unhappy with Yahoo Groups or Google Groups? Or are you looking for an email groups service for your company? Please try Groups.io.