Retrospective: Making a BGM Player for My Office

FYI: BGM stands for background music.

Where the idea came from

Our office has always enjoyed playing music; it helps us focus and obsess over the quality of our work. The back-office team had a Bluetooth speaker and the dev team had another, and music was played from YouTube or similar online streaming services through computers paired over Bluetooth. Everything was fine, but I found some problems that I could solve:

People saying "I requested ***".

Not everyone is satisfied with the playlist, so people sometimes request songs in the Slack channel named #_bgm, which means someone has to play those songs once the current song ends.
At least one person must be responsible for playing music and managing the playlist. From here on, this article calls that person either the "designated person" or the "player".
Sometimes people can't concentrate because of the volume, so they complain about the noise, and then the designated person turns the volume down.
The designated person occasionally forgets to play music when busy. The same goes for reacting to song requests and the like.
People want to know what song is playing right now.
And people want to know who the designated player is.

Idea

So I thought I could solve these problems in a simple way. I named the project "bgmplayer" and the bot "bgmnyan" (브금냥 in Korean). I knew of a project called youtube-dl that can download YouTube streams such as video and audio, and a project called ffmpeg that can convert those streams into normalized audio. I built a working prototype with these projects and used Slack integration to speed up development. Look at the progress I made in one day:

bgmnyan says "<@user> added '김동률-감사' to the playlist. (waiting 4 musics queued)".

Search results displayed using `/bgm if i cant have you`

The search results are displayed when people type `/bgm search keyword` in the bgm channel, and clicking a result adds it to the playlist. So far so simple, isn't it? The next piece of work was much harder, because it had to guarantee the playlist order and I had to figure out how to actually play the music. I kept thinking about a way to play it that was both easy and fast, and concluded that I could build a web-based music player. It had been a long while since I had used frontend tools such as React and Redux, so I needed time to study, but I decided to try it even if it all went wrong. Here is a screenshot of what I built in about two days of work.

It was simple because there is a project called react-player, a React component for playing multimedia content, and because I have a lot of backend development experience. Now I'm going to talk about how it works and what was hard to build.

Stacks

Backend

fastify + fastify-websocket for web server
typeorm (w/ postgres) for ORM
redis for caching
ytdl-core for downloading youtube videos as audio sources only
bee-queue for running background tasks using multiple task runners
slack for slack integration
sentry for reporting errors, and simple event tracking

I picked fastify because I knew it was a rising open-source project that was getting a lot of attention, and because it seemed that nobody was maintaining the expressjs project, whose last commit was on May 26, 2019. First, I made some slash commands through the Slack integration: /bgm [search keyword] (adds a song to the playlist), /playlist (shows the playlist), and /bgmplayer (provides a link to the frontend with basic JWT credentials). That was simple and easy. I designed the database models as follows:

User, corresponding to a slack user.
Item, corresponding to a youtube video.
PlaylistItem, corresponding to a playlist item.

There is no explicit "Playlist" entity, because PlaylistItem rows themselves can be grouped into a single playlist. To accomplish this, I added a field named "nextPlaylistItemId" (a unique key of the table) that explicitly refers to the next playlist item. The items can then be collected with a PostgreSQL CTE using WITH RECURSIVE. As long as the query can use indexes, this builds the playlist very fast: constructing it always took less than 0.1 ms.

I made a method that fetches the adjacent playlist items of a playlist item.
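To make the linked-list idea concrete, here is a minimal sketch, not the project's actual code. Only the `nextPlaylistItemId` field comes from the text above; the table name `playlist_item`, the `idx` column, and the helper `buildPlaylist` are assumptions made for illustration. The query string shows one way a recursive CTE could walk the pointers, and the function performs the same walk in memory.

```typescript
// Hypothetical row shape: only nextPlaylistItemId is named in the article.
interface PlaylistItemRow {
  id: string;
  nextPlaylistItemId: string | null;
}

// Assumed table and column names, shown for illustration only.
const PLAYLIST_QUERY = `
WITH RECURSIVE playlist AS (
  -- Anchor: the head is the row that no other row points to.
  SELECT p.*, 1 AS idx
  FROM playlist_item p
  WHERE NOT EXISTS (
    SELECT 1 FROM playlist_item prev
    WHERE prev."nextPlaylistItemId" = p.id
  )
  UNION ALL
  -- Recursive step: follow the pointer to the next row.
  SELECT n.*, playlist.idx + 1
  FROM playlist
  JOIN playlist_item n ON n.id = playlist."nextPlaylistItemId"
)
SELECT * FROM playlist ORDER BY idx;
`;

// In-memory equivalent of the query above.
function buildPlaylist(rows: PlaylistItemRow[]): PlaylistItemRow[] {
  const byId = new Map<string, PlaylistItemRow>();
  const pointedAt = new Set<string>();
  rows.forEach((row) => {
    byId.set(row.id, row);
    if (row.nextPlaylistItemId !== null) pointedAt.add(row.nextPlaylistItemId);
  });

  const ordered: PlaylistItemRow[] = [];
  let current = rows.find((row) => !pointedAt.has(row.id));
  // The length guard stops the walk if the data ever contains a cycle.
  while (current !== undefined && ordered.length < rows.length) {
    ordered.push(current);
    current =
      current.nextPlaylistItemId === null
        ? undefined
        : byId.get(current.nextPlaylistItemId);
  }
  return ordered;
}
```

The rows can arrive in any order; the walk starts from the one row nobody points to and follows the pointers until it reaches the row whose pointer is null.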

Next, I had to ensure that the order of playlist items is preserved, meaning no conflicts at all, even when the playlist is modified by moving or adding an item. Adding is simple: bind the new item as the "nextPlaylistItem" of the playlist item that has no "nextPlaylistItem", which is simply the last item. Moving was not as easy as I thought, because an item sometimes goes between two items, sometimes to the end, and sometimes to the front. I didn't want to spend a lot of time on it, though, so I implemented and tested every case. The implementation works like this: first, get the adjacent (previous and next) playlist items of both the item I want to move and the item at the target position, then update all of their link fields in one transaction, taking care of the special cases of moving to the front or the end of the playlist. After this work the code looks like a big bowl of 'update' soup, which is not very pleasant. But "it just works" as desired, so I left it in this state.
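The move described above boils down to three pointer updates. The following is a rough in-memory sketch under stated assumptions, not the author's code: `LinkedItem`, `moveItem`, and `orderedIds` are hypothetical names, and a `Map` stands in for the table. In the database the same updates would run inside a single transaction.

```typescript
// Hypothetical in-memory stand-in for PlaylistItem rows.
interface LinkedItem {
  id: string;
  nextPlaylistItemId: string | null;
}

// Moves `movingId` directly before `beforeId`, or to the end when `beforeId`
// is null. Mutates the items in place.
function moveItem(
  items: Map<string, LinkedItem>,
  movingId: string,
  beforeId: string | null,
): void {
  const moving = items.get(movingId);
  if (moving === undefined) throw new Error(`unknown item: ${movingId}`);
  if (beforeId !== null && !items.has(beforeId)) {
    throw new Error(`unknown item: ${beforeId}`);
  }
  // Nothing to do when the item is already in the requested position.
  if (movingId === beforeId || moving.nextPlaylistItemId === beforeId) return;

  const all = Array.from(items.values());

  // 1. Unlink: the old predecessor skips over the moving item.
  const oldPrev = all.find((item) => item.nextPlaylistItemId === movingId);
  if (oldPrev !== undefined) oldPrev.nextPlaylistItemId = moving.nextPlaylistItemId;

  // 2. Relink: the new predecessor (if any) points at the moving item.
  //    With beforeId === null this finds the current last item.
  const newPrev = all.find(
    (item) => item.id !== movingId && item.nextPlaylistItemId === beforeId,
  );
  if (newPrev !== undefined) newPrev.nextPlaylistItemId = movingId;

  // 3. The moving item points at its new successor.
  moving.nextPlaylistItemId = beforeId;
}

// Helper for the example: ids in playback order.
function orderedIds(items: Map<string, LinkedItem>): string[] {
  const all = Array.from(items.values());
  const pointedAt = new Set(all.map((item) => item.nextPlaylistItemId));
  let current = all.find((item) => !pointedAt.has(item.id));
  const ids: string[] = [];
  while (current !== undefined && ids.length < all.length) {
    ids.push(current.id);
    current =
      current.nextPlaylistItemId === null
        ? undefined
        : items.get(current.nextPlaylistItemId);
  }
  return ids;
}
```

Moving to the front is the case where step 2 finds no new predecessor, and moving to the end is the case where `beforeId` is null, so the same three steps cover both special cases.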

The next step was building the APIs and filling in the send and receive code, which was just manual work. I wanted the frontend client to be used not only by the player but by everyone who wants to see the playlist without typing the annoying and slow /playlist command in Slack, so I designed each websocket connection to carry its own state: user, channel, and sessionId. The sessionId (created by the client as a UUID v4 and stored in sessionStorage) is very important here. Anyone can be the player, as long as they have been nominated, so I had to make sure there is only one player per channel. I used Redis for this: the server sets a key like bgm:channels:${channel}:sessionId to the sessionId and keeps it alive for a while as long as the client keeps sending pings. When a client connects, the server checks whether a sessionId is already bound to the channel; if so, it compares that value with the connecting client's sessionId and tells the client whether it is the player or a viewer. As mentioned above, players must be designated, so I made three roles: admin (the active player), staff (a designated player who is currently a viewer because another session holds the player slot), and viewer.
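The single-player rule can be sketched without Redis. In the block below, an in-memory `Map` with expiry timestamps stands in for a Redis key with a TTL; the class `PlayerSlots`, the function `resolveRole`, and the `isDesignated` flag are hypothetical names, and the real project may organize this differently. Only the key pattern and the three roles are taken from the text.

```typescript
type Role = "admin" | "staff" | "viewer";

interface Slot {
  sessionId: string;
  expiresAt: number;
}

// In-memory stand-in for a Redis key with a TTL (assumption: the real
// server stores the same information in Redis).
class PlayerSlots {
  private readonly slots = new Map<string, Slot>();
  private readonly ttlMs: number;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  // Claims the slot for sessionId, or refreshes it on a ping.
  // Returns true when sessionId owns the slot after the call.
  claimOrRefresh(channel: string, sessionId: string, now: number): boolean {
    const key = `bgm:channels:${channel}:sessionId`;
    const slot = this.slots.get(key);
    const heldByOther =
      slot !== undefined && slot.expiresAt > now && slot.sessionId !== sessionId;
    if (heldByOther) return false;
    this.slots.set(key, { sessionId, expiresAt: now + this.ttlMs });
    return true;
  }
}

// Maps a connecting client to one of the three roles from the article.
function resolveRole(
  slots: PlayerSlots,
  isDesignated: boolean,
  channel: string,
  sessionId: string,
  now: number,
): Role {
  if (!isDesignated) return "viewer";
  return slots.claimOrRefresh(channel, sessionId, now) ? "admin" : "staff";
}
```

Passing `now` explicitly keeps the sketch easy to test; a server would pass `Date.now()` on connect and on every ping, so a player that stops pinging loses the slot once the TTL runs out.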

Building the APIs was easy, because they do not differ much from the Slack slash command controllers. As I said, the API runs over a websocket connection, and I made it work like RPC, because I didn't want to spend much time on a sophisticated design and wanted to see results as soon as possible. The frontend shares the same API spec code with the server in order to communicate; I will say more about this in the Frontend section.

Yup, I was almost done. The last thing I needed was to add the role-checking code.
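For readers unfamiliar with RPC over a websocket, here is a minimal sketch of what such a message exchange could look like. It is not the project's API spec: the envelope fields (`id`, `method`, `params`, `result`, `error`), the method name `playlist.get`, and the function `handleMessage` are all assumptions made for illustration.

```typescript
// Hypothetical request envelope sent as a JSON text frame.
interface RpcRequest {
  id: number;
  method: string;
  params?: unknown;
}

type RpcHandler = (params: unknown) => unknown;

function isRpcRequest(value: unknown): value is RpcRequest {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as { id?: unknown; method?: unknown };
  return typeof candidate.id === "number" && typeof candidate.method === "string";
}

// Takes one raw websocket message and returns the raw reply.
function handleMessage(handlers: Map<string, RpcHandler>, raw: string): string {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return JSON.stringify({ id: null, error: "invalid JSON" });
  }
  if (!isRpcRequest(parsed)) {
    return JSON.stringify({ id: null, error: "invalid request" });
  }
  const handler = handlers.get(parsed.method);
  if (handler === undefined) {
    return JSON.stringify({ id: parsed.id, error: `unknown method: ${parsed.method}` });
  }
  try {
    // The reply echoes the request id so the client can match it up.
    return JSON.stringify({ id: parsed.id, result: handler(parsed.params) });
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return JSON.stringify({ id: parsed.id, error: message });
  }
}

// Usage with a hypothetical method name.
const exampleHandlers = new Map<string, RpcHandler>([
  ["playlist.get", () => ["song-a", "song-b"]],
]);
const exampleReply = handleMessage(exampleHandlers, '{"id":1,"method":"playlist.get"}');
```

Because every request carries an `id` that the reply echoes, the client can keep several calls in flight on one connection.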
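One common way to add role checks is to wrap each handler in a guard. The sketch below assumes that approach; it is not taken from the project. The names `requireRole`, `ConnectionState`, and the `reportEnded` method are hypothetical, while the three roles and the per-connection state (user, channel, sessionId) come from the text.

```typescript
type ClientRole = "admin" | "staff" | "viewer";

// Per-connection state, as described in the article.
interface ConnectionState {
  user: string;
  channel: string;
  sessionId: string;
  role: ClientRole;
}

type GuardedHandler<P, R> = (conn: ConnectionState, params: P) => R;

// Wraps a handler so that it only runs for the allowed roles.
function requireRole<P, R>(
  allowed: ClientRole[],
  handler: GuardedHandler<P, R>,
): GuardedHandler<P, R> {
  return (conn, params) => {
    if (allowed.indexOf(conn.role) === -1) {
      throw new Error(`role "${conn.role}" is not allowed to call this method`);
    }
    return handler(conn, params);
  };
}

// Hypothetical method: only the active player may report the end of a song.
const reportEnded = requireRole(
  ["admin"],
  (conn: ConnectionState, params: { itemId: string }) =>
    `${conn.user} finished ${params.itemId}`,
);
```

Keeping the check in a wrapper means the handler bodies stay free of permission logic, and the allowed roles sit right next to each method.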

Frontend

react (boilerplate from create-react-app w/ typescript)
mobx
react-player

The frontend work was considerably difficult because I had no experience with how frontend development works nowadays. I studied hard, and after some time I got used to modern frontend design. A lot had changed since I last used React (roughly two years ago), and Redux was no longer a must-have for every frontend project. Since I at least had some React experience, I wanted to use React for this project, and MobX seemed like a really promising project for state management. MobX's concepts are simple to describe: it has observables, actions, and reactions. Observables represent the reactive state of the app, actions are how you mutate observables, and reactions run automatically when the observables they depend on change. Because I didn't want to spend my time writing Redux code such as initialState, actions, reducers, mapStateToProps, and mapDispatchToProps, I decided to go with MobX.

The frontend code is very simple. It has a few stores:

PageStore, which is responsible for page-level things such as toasts and modals.
PlayerStore, which is responsible for player state such as playing status, volume, and progress.
CommonStore, which is responsible for important things such as the token (given by the /bgmplayer command), sessionId (UUID), role (given by the server), playlist, isLoading, and so on. CommonStore also has a Communicator, which is designed to communicate with the backend server.

And because this is a SPA (single-page application), most of the code lives in a few components:

Layout
FakePlayer: as its name implies, FakePlayer is displayed for viewers and for designated players who are not currently playing.
Playlist: using react-beautiful-dnd (drag and drop)
PlaylistItem
SearchModal
SearchResultItem

MainPage is composed of these components.

Because nowPlaying is provided by the backend, CommonStore has a property named nowPlaying, which is updated when onEnded is called. nowPlaying in turn has a property decorated with MobX's computed decorator, which returns a streamable link pointing to the backend.

How streaming works

The client requests an item by its itemId (UUID) at a path like `/items/${itemId}`. The server checks that the request comes from the client using headers such as Referer, and then sends the audio as a stream of bytes.
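If "using bytes" here refers to HTTP byte ranges (an assumption; the original server code is not shown), the core of the server side is parsing the `Range` request header and answering with a matching `Content-Range`. The sketch below handles a single range only, and the names `parseRange` and `rangeHeaders` are hypothetical.

```typescript
interface ByteRange {
  start: number; // inclusive
  end: number; // inclusive
}

// Parses a single-range header such as "bytes=0-99", "bytes=500-" or
// "bytes=-200". Returns null when the range cannot be satisfied.
function parseRange(header: string | undefined, size: number): ByteRange | null {
  let start = 0;
  let end = size - 1;
  if (header !== undefined) {
    const match = /^bytes=(\d*)-(\d*)$/.exec(header.trim());
    if (match === null) return null;
    const rawStart = match[1];
    const rawEnd = match[2];
    if (rawStart === "" && rawEnd === "") return null;
    if (rawStart === "") {
      // Suffix form "bytes=-N": the last N bytes of the file.
      start = Math.max(size - Number(rawEnd), 0);
    } else {
      start = Number(rawStart);
      if (rawEnd !== "") end = Math.min(Number(rawEnd), size - 1);
    }
  }
  if (start > end || start >= size) return null;
  return { start, end };
}

// Response headers for a partial (206) response covering `range`.
function rangeHeaders(range: ByteRange, size: number): Record<string, string> {
  return {
    "Accept-Ranges": "bytes",
    "Content-Range": `bytes ${range.start}-${range.end}/${size}`,
    "Content-Length": String(range.end - range.start + 1),
  };
}
```

With the parsed range in hand, a Node server could open the file with `fs.createReadStream(path, { start, end })` and pipe it to the response, which is what lets an audio element seek without downloading the whole file.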

Hard parts of the frontend

Now I regret that I didn't read MobX's "Common pitfalls & best practices" page. I didn't know about these points:

Use @observer on all components that render @observables.
Dereference values as late as possible.

Because of these pitfalls, I couldn't figure out why components would not re-render after I updated observable properties. I also thought that @observer makes a component heavier, but the documentation says it doesn't.

Another problem came from react-beautiful-dnd. I wanted to move a playlist item to wherever I dropped it, but the library gave me a wrong index in every case. I read the official documentation several times but just couldn't figure out what was wrong with my code. I temporarily fixed it by correcting the index, but even at the time of writing I don't know what I did wrong.

Audio Normalization

Because the volume of YouTube sources can differ from channel to channel (FYI: YouTube itself already has an algorithm to limit loudness), I had to normalize the volume of the downloaded audio. The work was simple, although I hesitated to start it: I just used ffmpeg-normalize and made the download task trigger the normalization process when it finishes. As a result, the volume is normalized as I expected, and for now it is no longer a concern.

As you can see, there is a field that indicates whether the item has been normalized.
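The chain from download to normalization might look roughly like the sketch below. It is an illustration under stated assumptions, not the project's code: the field name `isNormalized`, the types, the output path scheme, and the codec choice (`-c:a aac -b:a 192k`) are all hypothetical. The command line uses the ffmpeg-normalize CLI options `-o` (output file), `-c:a` and `-b:a` (audio codec and bitrate), and `-f` (overwrite existing output).

```typescript
// Hypothetical item shape; the article only says a "normalized" flag exists.
interface DownloadedItem {
  id: string;
  audioPath: string;
  isNormalized: boolean;
}

interface NormalizeJob {
  itemId: string;
  outputPath: string;
  command: string[];
}

// Builds the ffmpeg-normalize invocation for one item.
function buildNormalizeJob(item: DownloadedItem): NormalizeJob {
  const outputPath = `${item.audioPath}.normalized.m4a`;
  return {
    itemId: item.id,
    outputPath,
    command: [
      "ffmpeg-normalize", item.audioPath,
      "-o", outputPath,
      "-c:a", "aac", "-b:a", "192k",
      "-f",
    ],
  };
}

// Called when the download task finishes: queue normalization once.
function onDownloadFinished(
  item: DownloadedItem,
  enqueue: (job: NormalizeJob) => void,
): void {
  if (!item.isNormalized) enqueue(buildNormalizeJob(item));
}

// Called when the normalization task finishes: switch to the new file.
function onNormalizeFinished(item: DownloadedItem, job: NormalizeJob): void {
  item.audioPath = job.outputPath;
  item.isNormalized = true;
}
```

A worker would run `job.command` with something like `child_process.execFile` and then call `onNormalizeFinished`; the flag keeps an item from being normalized twice if the download task is retried.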

WordCloud

Furthermore, I wanted to see what we had played during the day, so I set a goal of drawing a word cloud when we get off work. Because many of the songs have Korean titles, I had to use a Korean morpheme analyzer. This was simple except for artist names such as Taylor Swift and Lady Gaga, which a Korean morpheme analyzer cannot detect. So I built a simple word set of artist names by crawling music streaming websites like Spotify. Now the word cloud is generated every day at 18:40 (we get off at 18:45).
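The artist word set can be applied before tokenizing, so that multi-word names survive as a single keyword. Here is a minimal sketch of that idea. Splitting on whitespace and punctuation stands in for the Korean morpheme analyzer (a simplification), and `countKeywords` is a hypothetical name.

```typescript
// Counts word-cloud keywords in song titles. Known artist names are matched
// first (case-insensitively) so they are not broken into separate words.
function countKeywords(titles: string[], artists: string[]): Map<string, number> {
  const counts = new Map<string, number>();
  const bump = (word: string, by: number): void => {
    counts.set(word, (counts.get(word) ?? 0) + by);
  };
  // Longest names first, so a longer name wins over a shorter one it contains.
  const byLength = artists
    .filter((name) => name.trim() !== "")
    .sort((a, b) => b.length - a.length);

  titles.forEach((title) => {
    let rest = title.toLowerCase();
    byLength.forEach((artist) => {
      const parts = rest.split(artist.toLowerCase());
      if (parts.length > 1) {
        bump(artist, parts.length - 1);
        rest = parts.join(" "); // remove the matched name from the title
      }
    });
    // Stand-in for the morpheme analyzer: split on spaces and punctuation.
    rest.split(/[\s\-\/()\[\],.'"!?]+/).forEach((token) => {
      if (token !== "") bump(token, 1);
    });
  });
  return counts;
}

const exampleCounts = countKeywords(
  ["Taylor Swift - Lover", "taylor swift / Blank Space", "김동률 - 감사"],
  ["Taylor Swift", "김동률"],
);
```

The resulting counts can be handed to any word-cloud renderer as weights, with the artist names keyed exactly as they appear in the word set.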

Conclusion

This work solved the problems I had found:

When the backend server receives a nowPlaying event, it notifies the appropriate channel, so people know what song is playing now.

bgmnyan says "[Now playing] 내려간다 (Original) / Chait_ti (00:39, added by 한수훈)".

Related videos are added automatically, which lets the player concentrate on their work.
People can see what songs will play next.

bgmnyan says "[Playlist] '바보 Unforgettable' has automatically been added to the playlist".

People can adjust the volume whenever they want with the /volume command. This gives the player more free time, since they no longer have to care about the volume.

bgmnyan says "[Volume Adjustment] <@person> has adjusted the volume to 60%.".

Because the player role has been handed off to a public laptop that runs the player, nobody needs to care about the playlist and so on anymore.
And most importantly, people can now hear the songs they want.

Furthermore

I am preparing to open-source this project for other workspaces that suffer from the same issues.
I implemented a like event, which fires when people add a :heart: emoji or something similar. I believe this feature will help people hear songs they have liked when the server automatically adds songs to the playlist.

Metrics

With this small system, bgmnyan has:

3,748 songs played
3,456 songs downloaded
1,219 songs normalized
8 likes received

I am so pleased that I was able to solve this and make people more satisfied, and of course I am going to keep developing this project further.