Project Soon

Hugo

Before I created this site, I wondered whether I should do so at all. I have been using Discord for years, and it has been working fine. However, there are three reasons why I needed to move away from it. The first is that Discord is a proprietary, centralized platform that could be locked entirely behind a subscription or shut down at short notice. The second is that, while I invite only certain people to my server, I have heard some opinions voiced about invites from people I have not yet gotten to know well enough to feel comfortable inviting. The third is that Discord limits me to short posts of less than 500 characters, and I would sometimes like to write longer posts that go deeper into a subject. This made me think it was time to separate discussion from content on Discord and create a dedicated blog for my thoughts. After asking around, I found that Hugo[1] was a strong contender to be my SSG[2], in place of the CMS[3] that most blogs use. It supports several content formats, including Markdown, as well as metadata such as tags, draft status and date.

Installation went smoothly with just a single command, and running Hugo's setup procedure produced a folder structure that seemed intuitive at first. I downloaded a theme and got cracking on understanding the structure. I immediately began to struggle, both because the theme did not work as I expected and because the content tree I set up was not displayed as I expected. So I studied what the content tree should look like and then created my own theme that I could modify to my preferences. It does not look pretty, but at least it works, and I can easily modify it further as I learn. Some people mentioned that they would prefer a system where they only write content, but I like having something versatile that I can adjust to my liking while I keep learning. I like to write, but I also still strive to learn, to become better and more knowledgeable so that I have more to share.

For the content itself, I wanted to transfer everything from Discord in its raw state. Although this might only be a few hundred posts, I still felt it would be too time-consuming and error-prone to do by hand, so I quickly threw together a script to generate the Markdown files with their necessary metadata. I even made it download any attached files and add them at the bottom of the post. As only one of them was not an image, I just used the default figure shortcode and manually modified that single file to use a custom shortcode for downloading files. The initial script for fetching the data came from Stack Overflow, and I then modified it to fit my use. Currently I have to edit the file manually to change the channel and tag, but it could easily be adjusted to accept them as parameters from the CLI.

import requests
import json
import os
import re

# https://stackoverflow.com/a/71320367
def retrieve_messages(channelid, cat):
    num = 0
    limit = 10

    with open('.auth', 'tr') as f:
        auth = f.read().strip()

    headers = dict(authorization=auth)

    last_message_id = None

    messages = []

    while True:
        query_parameters = f'limit={limit}'
        if last_message_id is not None:
            query_parameters += f'&before={last_message_id}'

        r = requests.get(
            f'https://discord.com/api/v9/channels/{channelid}/messages?{query_parameters}',
            headers=headers
            )
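        r.raise_for_status()  # Fail loudly on HTTP errors (e.g. 401 or 429) rather than parsing an error body
        data = json.loads(r.text)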
        if len(data) == 0:
            break

        for value in data:
            messages.append(value)
            last_message_id = value['id']
            num += 1

    print('number of messages we collected is',num)

    # Reverse to traverse properly
    messages.reverse()

    # Glue together posts
    posts = []
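    for message in messages:
        if message['content'] and message['content'][0] in ['#', '*']:
            # A message starting with a heading or emphasis marker opens a new post
            posts.append(message)
        elif posts:
            # Anything else is a continuation of the previous post
            if message['content']:
                posts[-1]['content'] += "\n" + message['content']
            posts[-1]['attachments'].extend(message['attachments'])
            if posts[-1]['edited_timestamp'] is None and message['edited_timestamp'] is not None:
                posts[-1]['edited_timestamp'] = message['edited_timestamp']
        # Messages that precede the first post have nothing to attach to and are skipped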
    print('number of posts we collected', len(posts))

    # Create post structure
    for post in posts:
        timestamp = post['timestamp']
        edited_timestamp = post['edited_timestamp']
        content = post['content'].replace("\n\n", "\n")
        lines = content.split("\n")
        title = lines[0].replace('#', '').replace('*', '').strip()
        slug = re.sub(r'[^a-z0-9\-_]', '', title.lower().replace(' ', '-'))
        # Create slug, carefully not to override existing post
        path = f'posts/{slug}'
        i = 1
        while os.path.exists(path):
            i += 1
            path = f'posts/{slug}-{i}'
        os.makedirs(path)
        attachments = []
        # Download attachments
        for attachment in post['attachments']:
            try:
                with requests.get(attachment['url'], stream=True) as r:
                    with open(f"{path}/{attachment['filename']}", 'wb') as f:
                        for chunk in r.iter_content(chunk_size=8192):
                            f.write(chunk)
            except Exception as e:
                print('Failed to download', attachment['filename'], e)
            else:
                attachments.append(attachment['filename'])
        # Create index
        # Using markdown, but could be changed
        with open(f'{path}/index.md', 'tw') as f:
            header = []
            # Using TOML, but can easily be changed to YAML or JSON if needed
            header.append("+++")
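            # Emit a double-quoted, escaped string so quotes in the title cannot break the TOML
            header.append(f"title = {json.dumps(title, ensure_ascii=False)}")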
            header.append(f"date = {timestamp}")
            if edited_timestamp is not None:
                header.append(f"lastmod = {edited_timestamp}")
            header.append("draft = false")
            header.append(f"tags = ['{cat}']")
            header.append(f"categories = ['{cat}']")
            header.append("+++")
            f.write("\n".join(header))
            f.write("\n")
            f.write("\n\n".join(lines[1:])) # Double newline for an actual newline in markdown
            f.write("\n")
            for attachment in attachments:
                f.write("\n")
                f.write(f'{{{{< figure src="{attachment}" >}}}}')
                f.write("\n")

retrieve_messages('<channelid>', '<tag/category>')
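
Taking the channel and tag from the command line, instead of editing the file each time, could look roughly like the following. This is only a minimal sketch using Python's standard `argparse` module; the function name `parse_args` and the argument names `channel_id` and `tag` are assumptions made for illustration and are not part of the script above.

```python
import argparse


def parse_args(argv=None):
    """Read the channel ID and the tag/category from the command line.

    The argument names are hypothetical; passing argv=None makes argparse
    fall back to sys.argv, which is what a real run would use.
    """
    parser = argparse.ArgumentParser(
        description="Export the messages of a Discord channel as Hugo posts."
    )
    parser.add_argument("channel_id", help="ID of the Discord channel to export")
    parser.add_argument("tag", help="tag/category written into each post's front matter")
    return parser.parse_args(argv)


# In the script above, the hard-coded final call would then become:
#     args = parse_args()
#     retrieve_messages(args.channel_id, args.tag)
```

The script could then be run as `python export.py <channelid> <tag/category>` (the file name is hypothetical), and `argparse` would print a usage message if either value is missing.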

As some of you might have noticed, I have not filtered the content I copied over, nor have I adjusted it to fit this platform; if I ever went back and did that, it would only be to make the posts consistent. I also plan to add more features to the blog to make it easier to navigate. I will probably shuffle some posts and sections around, as I still find the blog a bit difficult to navigate in certain cases. The tags are also quite limited, so it would be great to break them out into more keywords and apply them as metadata for the page itself.

There are several more reasons why having my own dedicated site is better than keeping everything on Discord: I can create drafts for all the subjects I want to write about; I can set up a publishing workflow so that I take my time to write and publish instead of writing off the top of my head; I can verify drafts more thoroughly before publishing; and I have more versatility in structuring content, be it code, images or scripts. There are probably many more that I just cannot think of off the top of my head. The more I learn about Hugo, the easier and faster I can modify it to do my bidding, from how Markdown behaves to how the content is displayed on each page, and I am sure I will enjoy it more the more I use it.


  1. https://gohugo.io/

  2. https://en.wikipedia.org/wiki/Static_site_generator

  3. https://en.wikipedia.org/wiki/Content_management_system