Groups.io Update

It’s been almost six months since I launched Groups.io and four months since I last talked about it here on the blog, so I figured it was time for an update. I’ve been heads down working on new features and bug fixes. Here’s a short list of the major features added during that time:

Slack Member Sync

Mailing lists and chat, like peanut butter and chocolate, go great together. Do you have a Slack Team? You can now link it with your Groups.io group. Our new Slack Member Sync feature lets you synchronize your Slack and Groups.io member lists. When someone joins your Groups.io group, they will automatically get an invite to join your Slack Team. And when someone joins your Slack Team, they’ll automatically get added to your Groups.io group. You can configure the sync to be automatic or you can sync members by hand. Access the new member sync area from the Settings page for your group.

As an aside, bacon and chocolate, another potentially great combination, do not go great together. Trust us, we’ve tried.

Google Log-in

You can now log into Groups.io using Google. For new users, this allows them to skip the confirmation email step, making it quicker and easier to join your groups.

Markdown and Syntax Highlighting Support

You can now post messages using Markdown and emoji characters. And we support syntax highlighting of code snippets.

Archive Management Tools

The heart of a group is the message archive. And nobody likes an unorganized archive. We’ve added the ability to split and merge threads. Has a thread changed topics halfway through? Split it into two threads. Or if two threads are talking about the same thing, you can merge them. You can also delete individual messages, and change the subject of threads.

Subgroups

Groups.io now supports subgroups. A subgroup is a group within another group. When viewing your group on the website, you can create a subgroup by clicking the ‘Subgroup’ tab on the left side. The email address of a subgroup is of the form parentgroup+subgroup@groups.io.
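
To make the address form concrete, here is a minimal Go sketch that splits such an address into parent and subgroup. The GroupAddress type and ParseGroupAddress function are hypothetical names invented for this illustration, not part of the Groups.io code, and the sketch ignores any other uses the "+" character may have in real addresses.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// GroupAddress is a hypothetical type for this sketch.
type GroupAddress struct {
	Parent   string
	Subgroup string // empty for a top-level group
}

// ParseGroupAddress splits "parent+sub@groups.io" into its parts.
// A plain "group@groups.io" yields an empty Subgroup.
func ParseGroupAddress(addr string) (GroupAddress, error) {
	local, domain, ok := strings.Cut(addr, "@")
	if !ok || local == "" || !strings.EqualFold(domain, "groups.io") {
		return GroupAddress{}, errors.New("not a groups.io address")
	}
	parent, sub, hasSub := strings.Cut(local, "+")
	if parent == "" || (hasSub && sub == "") {
		return GroupAddress{}, errors.New("malformed group address")
	}
	return GroupAddress{
		Parent:   strings.ToLower(parent),
		Subgroup: strings.ToLower(sub),
	}, nil
}

func main() {
	a, err := ParseGroupAddress("parentgroup+subgroup@groups.io")
	fmt.Println(a.Parent, a.Subgroup, err) // parentgroup subgroup <nil>
}
```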

Subgroups have all the functionality of normal groups, with one exception. To be a member of a subgroup, you must be a member of the parent group. A subgroup can be open to all members of the parent group, or it can be restricted. Archives can be viewable by members of the parent group, or they can be private to the members of the subgroup. Subgroups are listed on the group home page, or they can be completely hidden.

Calendar, Files and Wiki

Every group now has a dedicated full-featured Calendar, Files section, and Wiki.

In other news, we also started an Easy Group Transfer program, for people who wish to move their groups from Yahoo or Google over to Groups.io.

Email groups are all about community, and I’m pleased that the Beta group has developed into a valuable community, helping define new features and scope out bugs. I’m working to be as transparent as possible about the development of Groups.io through that group, and through a dedicated Trello board which catalogs requested features and bug reports. If you’re interested, please join and help shape the future of Groups.io!

Groups.io Database Design

Continuing to talk about the design of Groups.io, today I’ll talk about our database design.

Database Design

Groups.io is built on top of PostgreSQL. We use GORP to handle marshaling our database objects. We split our data over several separate databases. The databases are all currently running in one PostgreSQL instance, but this will allow us to easily split data over several physical databases as we scale up. A downside to this is that we end up having to manage more database connections now, and the code is more complicated, but we won’t have to change any code in the future when we split the databases over multiple machines (sharding is a whole other thing).
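
One way to picture this is a small registry that maps each logical database to its location. The sketch below is an illustration only: the DBConfig type, the host name db01.internal, and the port are assumptions, and it builds connection URIs rather than opening real connections so that it runs without a database driver.

```go
package main

import "fmt"

// DBConfig says where one logical database lives.
// The type, hosts, and port are hypothetical.
type DBConfig struct {
	Host string
	Port int
	Name string
}

// DSN renders a PostgreSQL connection URI for this database.
func (c DBConfig) DSN() string {
	return fmt.Sprintf("postgres://%s:%d/%s", c.Host, c.Port, c.Name)
}

// Databases maps each logical database to its location. Every entry
// points at the same instance for now; moving one database to another
// machine means changing only its Host here.
var Databases = map[string]DBConfig{
	"userdb":        {Host: "db01.internal", Port: 5432, Name: "userdb"},
	"archivedb":     {Host: "db01.internal", Port: 5432, Name: "archivedb"},
	"activitydb":    {Host: "db01.internal", Port: 5432, Name: "activitydb"},
	"deliverydb":    {Host: "db01.internal", Port: 5432, Name: "deliverydb"},
	"integrationdb": {Host: "db01.internal", Port: 5432, Name: "integrationdb"},
}

func main() {
	names := []string{"userdb", "archivedb", "activitydb", "deliverydb", "integrationdb"}
	for _, name := range names {
		fmt.Println(name, "->", Databases[name].DSN())
	}
}
```

Because callers ask for a database by its logical name, the code that uses a connection does not need to know whether two databases share a machine.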

There are no joins in the Groups.io system and there are no foreign key constraints. We enforce constraints in an application layer. We did this for future scalability. It did require more work in the beginning and it remains to be seen if we engaged in an act of premature optimization. Every record in every table has a 64-bit integer primary key.
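
As a rough sketch of what enforcing constraints in an application layer can look like, the Go code below checks that the referenced user and group exist before creating a subscription, and removes dependent rows by hand when a group is deleted. The Store type and its in-memory maps are stand-ins invented for this example; the real system talks to PostgreSQL.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	ErrNoSuchUser  = errors.New("no such user")
	ErrNoSuchGroup = errors.New("no such group")
)

// Subscription links a user to a group. Every record has a 64-bit key.
type Subscription struct {
	ID, UserID, GroupID int64
}

// Store is a hypothetical in-memory stand-in for the database tables.
type Store struct {
	users  map[int64]bool
	groups map[int64]bool
	subs   map[int64]Subscription
	nextID int64
}

func NewStore() *Store {
	return &Store{
		users:  map[int64]bool{},
		groups: map[int64]bool{},
		subs:   map[int64]Subscription{},
	}
}

// AddSubscription does the check a foreign key would otherwise do.
func (s *Store) AddSubscription(userID, groupID int64) (int64, error) {
	if !s.users[userID] {
		return 0, ErrNoSuchUser
	}
	if !s.groups[groupID] {
		return 0, ErrNoSuchGroup
	}
	s.nextID++
	s.subs[s.nextID] = Subscription{ID: s.nextID, UserID: userID, GroupID: groupID}
	return s.nextID, nil
}

// DeleteGroup removes dependent subscriptions first, since there is
// no ON DELETE CASCADE to do it for us.
func (s *Store) DeleteGroup(groupID int64) {
	for id, sub := range s.subs {
		if sub.GroupID == groupID {
			delete(s.subs, id)
		}
	}
	delete(s.groups, groupID)
}

func main() {
	s := NewStore()
	s.users[1] = true
	s.groups[10] = true
	id, err := s.AddSubscription(1, 10)
	fmt.Println(id, err) // 1 <nil>
	_, err = s.AddSubscription(1, 99)
	fmt.Println(err) // no such group
}
```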

We have 3 database machines. DB01 is our main database machine. DB02 is a warm-standby, and DB03 is a hot-standby. We use WAL-E to back up DB01’s database to S3. DB02 uses WAL-E to pull its data from S3 to keep warm. All three machines also run Elasticsearch as part of a cluster. We run statistics on DB03.

Our data is segmented into the following main databases: userdb, archivedb, activitydb, deliverydb, integrationdb.

Userdb

The userdb contains user, group and subscription records. Subscriptions provide a mapping from users to groups, and we copy down several bits of information from users and groups into the subscription records, to make some processing easier. Here are some of the copied down columns:

GroupName string // Group.Name
Email string // User.Email
UserName string // User.UserName
FullName string // User.FullName
UserStatus uint8 // User.Status
Privacy uint8 // Group.Privacy

We maintain these columns in an application layer above the database. By duplicating this information in the subscription record, we greatly reduce the number of user and group record fetches we need to do throughout the system. These fields rarely change, so there’s not a large write penalty. There is definitely a memory penalty, with the expanded subscription record. But I figured that was a good trade-off.
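
The maintenance step might look something like the following sketch: when a user’s email changes, the application rewrites the copy held in each of that user’s subscriptions. The UpdateUserEmail function and the trimmed-down structs are assumptions made for illustration; only the Email field name and its "User.Email" origin come from the column list above.

```go
package main

import "fmt"

// User and Subscription are trimmed-down, hypothetical versions of
// the real records, keeping only what this sketch needs.
type User struct {
	ID    int64
	Email string
}

type Subscription struct {
	ID      int64
	UserID  int64
	GroupID int64
	Email   string // copied down from User.Email
}

// UpdateUserEmail changes the user's email and refreshes the copied
// column in each of the user's subscriptions. It returns how many
// subscription records were rewritten.
func UpdateUserEmail(u *User, subs []*Subscription, email string) int {
	u.Email = email
	n := 0
	for _, s := range subs {
		if s.UserID == u.ID {
			s.Email = email
			n++
		}
	}
	return n
}

func main() {
	u := &User{ID: 1, Email: "old@example.com"}
	subs := []*Subscription{
		{ID: 1, UserID: 1, GroupID: 10, Email: "old@example.com"},
		{ID: 2, UserID: 1, GroupID: 11, Email: "old@example.com"},
		{ID: 3, UserID: 2, GroupID: 10, Email: "other@example.com"},
	}
	fmt.Println(UpdateUserEmail(u, subs, "new@example.com")) // 2
}
```

The write touches one row per subscription, which is the write penalty mentioned above; reads of a subscription never need to fetch the user.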

Archivedb

The archivedb stores everything related to message archives. The main tables are the thread table and the message table. We store every message in the message table, as raw compressed text, but before we insert each message, we strip out any attachments, and instead store them in Amazon’s S3. This reduces the average size of emails to a much more manageable level.

Activitydb

The activitydb stores activity logging records for each group.

Deliverydb

The deliverydb stores bounce information for users.

Integrationdb

The integrationdb stores information relating to the various integrations available in Groups.io.

Search

We use Elasticsearch for our search, and our indexes mirror the PostgreSQL tables. We have a Group index, a Thread index and a Message index. I tried a couple of Go Elasticsearch libraries and didn’t like any of them, so I wrote my own simple library to talk to our cluster.

Next Time

In future articles, I’ll talk about some aspects of the code itself. Are there any specific topics you’d like me to address? Please let me know.

Are you unhappy with Yahoo Groups or Google Groups? Or are you looking for an email groups service for your company? Please try Groups.io.

What Runs Groups.io

I always appreciate when people talk about how they’ve built a particular piece of software or a web service, so I thought I’d talk about some of the architecture choices I made when building Groups.io, my recently launched email groups service. This will be a multi-part series.

Go

One of the goals I had when I first started working on Groups.io was to use it as an opportunity to learn the new language Go. Groups.io is written completely in Go and is my first project in the language. As a diehard C programmer (ONElist was written in C, and Bloglines was written in C++), I took very little time to get up to speed on Go, and I now consider myself a huge fan of the language. There are many reasons why I like to code in Go. It’s compiled, so it’s fast and you get all the code checks you miss from interpreted languages. It generates standalone binaries, which is great for distributing to production machines. It’s got a great standard library. It’s easy to write concurrent code (Go’s lightweight threads are called goroutines). The documentation system is good. But besides all that, the philosophy behind Go just fits my mental model better than any other language I’ve worked in. It all combines to make programming in Go the most fun I’ve had coding in a very long time.

Components

Groups.io consists of several components that interact with each other. All interactions are done using JSON over HTTP.

Web

The web server handles all web traffic, naturally. It is proxied behind nginx, because I believe that makes for a more flexible and slightly more secure system. Nginx terminates the encrypted HTTPS traffic and passes the unencrypted traffic to the web process. We use the standard Go HTML template system for our web templates, and we use several parts of the Gorilla web toolkit. We use Bootstrap for our HTML framework.

Smtpd

The smtpd daemon handles incoming SMTP traffic for the groups.io domain. It is also proxied behind nginx. The email it handles consists mainly of group messages, although there are some other messages as well, including bounce messages. It sends group and bounce messages to the messageserver for processing. Other messages are forwarded, using a set of rules, to other email addresses. We based smtpd heavily on Go-Guerrilla’s SMTPd.

Messageserver

The messageserver daemon processes group messages, bounce messages and email commands. For group messages, it verifies that the poster is subscribed and has permission to post to the group, archives the message, and sends it out to the group subscribers, using Karl to send the messages. It also sends the messages to our Elasticsearch cluster. Bounce and email command messages are processed as well. All group messages are processed through the messageserver, whether they arrive through the smtpd, or whether they were posted through the web site.
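
The order of those steps for a group message can be sketched as follows. This is an in-memory illustration with hypothetical types (Group, Subscriber) and a deliver callback standing in for the hand-off to the sending process; indexing, bounces, and email commands are left out.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	ErrNotSubscribed = errors.New("poster is not subscribed")
	ErrNoPermission  = errors.New("poster may not post to this group")
)

// Subscriber and Group are hypothetical, simplified records.
type Subscriber struct {
	Email   string
	CanPost bool
}

type Group struct {
	Name    string
	Subs    map[string]Subscriber // keyed by email
	Archive []string
}

// Post verifies the poster, archives the message, then hands it to
// deliver once per subscriber.
func (g *Group) Post(from, body string, deliver func(to, body string)) error {
	sub, ok := g.Subs[from]
	if !ok {
		return ErrNotSubscribed
	}
	if !sub.CanPost {
		return ErrNoPermission
	}
	g.Archive = append(g.Archive, body)
	for _, s := range g.Subs {
		deliver(s.Email, body)
	}
	return nil
}

func main() {
	g := &Group{Name: "beta", Subs: map[string]Subscriber{
		"a@example.com": {Email: "a@example.com", CanPost: true},
		"b@example.com": {Email: "b@example.com", CanPost: false},
	}}
	sent := 0
	err := g.Post("a@example.com", "hello", func(to, body string) { sent++ })
	fmt.Println(err, sent, len(g.Archive)) // <nil> 2 1
}
```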

Karl

Karl, named after Karl ‘The Mailman’ Malone, is our email sending process. It is responsible for all emails originating from the groups.io domain. It is passed an email message, a footer template, a sender, and a set of data about each receiver the message should be sent to. For each receiver, it evaluates the template, inserting subscriber specific information, and then merges it with the email message before sending it out. It also handles DKIM signing of emails. It stores all emails using Google’s LevelDB database until they are successfully sent.

A reasonable question to ask is why didn’t I outsource the email delivery part of the service. There are several companies that provide email delivery outsourcing. In general, outsourcing is a way to save development time. But when I thought about it, I did not think I’d be able to save much time by outsourcing; I’d still have to connect our data with whatever templating system the email delivery service used. And Karl did not take very long to write. But more importantly, email delivery is a core competency of our service and I believe we have to own that.

Errord

Errord is a simple logging process, used to log error messages and stack traces from any core dumps in any of the other processes. I can look at the errord log and instantly see if anything in the system has crashed and where it crashed.

Rsscrawler, Instagramcrawler

Rsscrawler and instagramcrawler are cronjobs that deal with the Feed and Instagram integrations, respectively. Rsscrawler looks for updates in feeds that are integrated with our groups, and instagramcrawler does the same for Instagram accounts. They’re currently run twice an hour. If they find an update, they generate a group message and pass it along to the messageserver.

Bouncer

Bouncer is a cronjob that is run once a day to manage bouncing users.

Expirethreads

Expirethreads is a cronjob that’s run twice an hour to expire threads that are tagged with hashtags that have an expiration.

Senddigests

Senddigests is a cronjob that’s run once a night, to generate digest emails for users with digest subscriptions.

Next Time

In future articles, I’ll talk about the machine cluster running Groups.io, the database design behind the service, and some aspects of the code itself. Are there any specific topics you’d like me to address? Please let me know.

Introducing Groups.io

I’m not one to live in the past (well, except maybe for A-Team re-runs), but for many years now, I’ve felt like I’ve had unfinished business. I started the service ONElist in 1998. ONElist made it easy for people to create, manage, run and find email groups. As it grew over the next two and a half years, we expanded, changed our name to eGroups, and, in the summer of 2000, were acquired by Yahoo. The service was renamed Yahoo Groups, and I left the company to pursue other startups.

But really this story starts even further back, in the winter of 1989, when in college I was introduced to mailing lists. I was instantly hooked. It was obvious that a mailing list was a great way to communicate with a group of people about a common interest. I started subscribing to lists dedicated to my favorite bands (’80s Hair Metal, anyone?). I joined a list for a local running club. And, at every company I’ve worked at since graduating, there have been invaluable internal company mailing lists.

But that doesn’t mean that mailing lists can’t improve. And this is where we get back to the unfinished business. Because email groups (the modern version of mailing lists) have stagnated over the past decade. Yahoo Groups and Google Groups both exude the dank air of benign neglect. Google Groups hasn’t been updated in years, and some of Yahoo’s recent changes have actually made Yahoo Groups worse! And yet, millions of people put up with this uncertainty and neglect, because email groups are still one of the best ways to communicate with groups of people. And I have a plan to make them even better.

So today I’m launching Groups.io in beta, to bring email groups into the 21st Century. At launch, we have many features that those other services don’t have, including:

  • Integration with other services, including: Github, Google Hangouts, Dropbox, Instagram, Facebook Pages, and the ability to import Feeds into your groups.
  • Businesses and organizations can have their own private groups on their own subdomain.
  • Better archive organization, using hashtags.
  • Many more email delivery options.
  • The ability to mute threads or hashtags.
  • Fully searchable archives, including searching within attachments.

We’re just starting out; following the tradition of new startups everywhere, we’re in Beta. We’re working hard to squash the inevitable bugs and to make the system even better (based on your feedback!).

I’m passionate about email groups. They are one of the very best things about the Internet and, with Groups.io, I’ve set out to make them even better. As John ‘Hannibal’ Smith, leader of the A-Team, liked to say, “I love it when a plan comes together.”

Yahoo Groups

I read with interest Marissa Mayer’s comments today at the Goldman Sachs Technology conference, specifically her mention of Yahoo Groups:

One of our strongholds has been Yahoo Groups, as it moves to the phone it opens up all kinds of possibilities. The phone is a much better place to do group communication.

My first startup was ONElist, which was renamed Yahoo Groups after we were acquired in August 2000. Over the past 12 plus years, I’ve watched as Yahoo did basically nothing with Groups. It’s still almost the same as when it was acquired. Yahoo has devoted only enough resources to keep it going all these years. In fact, if you try to use the site now, it often times out and is generally extremely sluggish. I don’t have current numbers, but I’ve been told that even with all the neglect, Groups still has over 100 million users. The group archives make up many petabytes of data. It is not a small service.

Email groups are great ways to communicate. As numerous people have told me over the years, Yahoo Groups have affected people’s lives in significant and profound ways. As my friends will attest, I’m at least as cynical as the next software engineer. But I think group communication is one of the most important aspects of the Internet and I truly believe that it has made and continues to make the world a better, safer, more inclusive place. But Y! Groups has stagnated for 12 years.

Several months ago, I got fed up with the state of (neglect of) Groups and decided to start working on a next generation Groups service. It’s not ready yet, but it’s not too far out.

With all that, ever since Mayer took over as CEO, I’ve been watching for signs that she’d devote resources to Groups, and this is the first sign I’ve seen that they may be working on an update. They have a lot of challenges in doing so. With a service that hasn’t changed in 12 years, people have become accustomed to the interface and I believe there will be a lot of resistance from long time Groups users (which is the subject of an essay for another day). But I know that Groups can be so much more than what Y! Groups are right now. It’s only a matter of time. Whether Yahoo, or I, or someone else launches the next generation of groups, it will happen, and people will be better for it.

ONElist Office

Sam Rushing recently came across some old photos he took, including this one, which is a panorama of the old ONElist building in Redwood City. It was taken in February, 2000, which was after we merged with eGroups and before we were acquired by Yahoo (and became Yahoo Groups). The office was a converted warehouse and had about 50 people in it. This photo doesn’t show all the cubicles behind the photographer, nor does it show the offices underneath. During the whirlwind that was ONElist, to my lasting regret, I never took any photos, so I especially appreciate Sam’s rediscovery.

The cardboard cutout, btw, is Sarah Michelle Gellar, during her Buffy the Vampire Slayer days. I never knew the story behind why that cutout was in the office.

Snap Groups

Now that I’ve gotten my blog back in order, I can (belatedly) announce the launch of my latest project, Snap Groups. It’s a return to communities for me, something I’ve been involved with since I wrote my first BBS system back in 1983 and continued up through the development of ONElist (now Yahoo Groups). Snap Groups is a new take on communities, combining elements of Facebook, Yahoo Groups, and real-time communication networks like Twitter. A great description of Snap Groups was written by Marshall Kirkpatrick on the RWW blog. I hope you will check out Snap Groups!

ONElist’s 10 Year Anniversary

I missed this by a couple of days, but it was 10 years ago when I launched ONElist (now Yahoo Groups). It was a Saturday night, January 24, 1998, and I had just completed three months of coding the site, by working nights and weekends. I had never created a web site before and one of the reasons I started ONElist was because I wanted to learn how to do so.

I talk about how I launched the site in the (woefully incomplete and unmaintained) ONElist files:

I wanted to start things slowly, so I decided to try to get one person to
start a list. I’d be able to shake out any remaining bugs and get feedback.
So I did a search of USENET looking for people who wanted to start mailing
lists but didn’t know how. I found one person, who happened to be in Norway,
and spammed him about the service. Then I went to bed. Little did I know
that this would be the last night of (non-alcohol induced) restful sleep for
the next couple of years.

The next morning, I was hoping that there’d be one new list created. Or
at least I hoped the guy from Norway didn’t complain about me spamming him.
Instead, to my surprise, there were about 20 lists created. The guy from
Norway had created his list and then told all his friends about it. And that’s
how it grew. You create a list and of course you want subscribers. So you
tell your friends. It snowballs. Viral marketing, the VCs call it.

So what was the first list? Discourse about Shakespearean influence in modern
playwrights? Talk about rising tensions in the Middle East? In depth political
discussion about globalization and free trade? No, no, no. It was about
lizards. Not just any kind of lizard, but Anole lizards. From a guy in Norway.
It’s still there, even: Discussion list for all Anole species. And most of the
other new lists were lizard lists. I suddenly had visions in my head about our
first press release. “Leaping lists of lizards!,” it would shout. Herpetologists
rejoice!

From that, there was little stopping it. I occasionally posted announcements
to USENET groups about ONElist, but the growth really came from word of mouth.
In hindsight, I guess it’s obvious that mailing lists are viral. But at the
time, I had no idea. I just wanted to create a service that made finding and
managing mailing lists easier.

Reading that back, it really doesn’t convey the shock and amazement I felt that Sunday morning when
I logged in and saw those 20 lists. It really was incredible to me.

I have a tendency to get caught up in things and
not fully appreciate what’s going on in my life at a given moment, whether it’s a relationship, a
vacation, or, in this case, a life-changing startup experience. It all went so fast. But I am extremely fortunate that
I remain friends with many of the people I worked with during that time.

ONElist was an amazing 3 years of my life. To this day, I still occasionally hear stories about how
one or another of the mailing lists has changed someone’s life. As an engineer, it’s incredibly
gratifying to have been involved in the creation of something that so many people use on a daily
basis. And to think, when I started it, I had no idea if anyone would use it.

I Got Paid To Blog

Recently I answered a posted issue on the Techdirt Insight Community service. As I blogged previously, the Insight Community system is a way for companies to ask questions and gain insight from a group of experts. I’m an investor and board member of Techdirt.

After registering for the service, I started receiving issue notifications. When one came up asking about designing a mobile RSS strategy, I jumped. Given my background with Bloglines, I had a definite point of view and very relevant experience in the area. I wrote up my thoughts and submitted my response. A couple weeks later, I was notified that my response had been accepted and that I would receive $500 for my work.

This is a great use of the Internet. The company was able to quickly get detailed, expert responses to a question it had. I was able to utilize my experience in a field and get paid doing so, from the comfort of my home.

Wesabe Raises Series A

Congrats to Marc and Jason and the rest of the gang. They announced today that Wesabe has raised $4M in a Series A round, led by Union Square Ventures. I was privileged to be able to participate in the seed round, and it’s been fun watching them grow so far.