Blog Post Generator (tutorial)

2026-01-19

I recently read a post by Tim Dettmers on how he uses AI to support his workflow. In case you don’t know of Tim, his work on quantization has been foundational to how AI has developed over the past few years, enabling experimentation at multiple levels. The post gives a couple of examples of tasks he found AI helpful for, which revolve mainly around ideation, and he also notes one area, email, where it didn’t work. While the positive examples lacked some specifics, he did mention that he uses it for writing blog posts, including the one in question. This got me thinking about whether I could do something similar, so I spent a couple of hours coding a blog generator. This post is essentially a tutorial on how something like that could be done.

As Tim notes, the primary usefulness of AI is in its ability to automate tasks. For LLMs, this means tasks related to language specifically. I have some ruminations about this in a video (I think AI should be an acronym for “Automated Inference” rather than artificial intelligence), but essentially what we want from LLMs in many cases is the ability to write text for us, in various contexts. This means we have to be clear about the output we want and then figure out how to configure the inputs to get there. Along the way, we also need to be clear about what kind of expense (in time, effort, etc.) we are willing to bear, i.e. “is the potential time saved worth the effort?”

This kind of cost-benefit analysis is inherent in any automation endeavor. As Larry Wall (of Perl fame) famously noted, laziness is one of the virtues of a programmer, since the effort you put in to create labor-saving programs pays off in time saved down the road. I have already coded a number of small programs that I use to automate various aspects of my work, and over the past few years they have saved me tens if not hundreds of hours. Rather than explaining how this does or doesn’t apply to a blog generation tool, I thought I’d detail the process I followed, and you the reader can be the judge (perhaps with your own data) of its success (or failure).

The output

The output we’re looking for is a blog article that can be posted without much editing. Essentially something I would write about a given topic. While of course it won’t really sound like me, I would want whatever gets produced to be easy for me to adapt to my voice/style.

For this case, I will be generating a follow-up to a post that I wrote as a “Part 1” about LLMs, which focused on a description generator using keywords. The result of that experiment wasn’t great, but rather than focus on any developments to such a tool, the follow-up post is actually about a different use case: the identification of implicit motive imagery in text.

The input

There are a few components that should inform the generated post, which (as someone familiar with the topic) I would consider in writing it.

  1. A set of sentences (bullet points) to guide the general flow of the post:
    • Title: Some notes on LLMs in real-world contexts (Part 2)
    • Summary of previous post
    • Description of using large language models to automate Implicit Motive Classification
    • Challenges of this approach
    • Benefits of this approach
    • etc.
  2. The original “Part 1” blog post (linked above), which gets summarized.
  3. Some other information and research papers (Brede et al., Nilsson et al.) related to the topic.

Each of these inputs needs to be processed in some way and incorporated into the prompt. To ensure that the model can consider all of this data, I’m using Phi-3-mini-128k-instruct for its long context window.

The process and code

The following sections walk step by step through the Python code for this process.

First we import the libraries we will need - mainly just transformers and some utilities.

import glob, time
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from transformers.pipelines.pt_utils import KeyDataset
from datasets import Dataset

Then we give the path of the model and set up the pipeline. Phi-3-mini is a small model from Microsoft that supports longer contexts (if you have the requisite memory) and actually does pretty decently at following instructions. Here I’m using a Hugging Face version that has been quantized, which means it takes up much less memory, though this comes with a slight hit to performance. We can also define some of our generation arguments in a dict.

modelname = "PrunaAI/microsoft-Phi-3-mini-128k-instruct-bnb-4bit-smashed"

model = AutoModelForCausalLM.from_pretrained(
    modelname,
    device_map="auto",
    dtype="auto",
    trust_remote_code=False,
    )

tokenizer = AutoTokenizer.from_pretrained(modelname)

pipe = pipeline(
    "text-generation", # generative pipeline
    model=model,
    tokenizer=tokenizer,
    )

generation_args = {
    "return_full_text": False,
    "do_sample": False,
    "use_cache": True,
    }
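Before moving on, it’s worth a quick sanity check that the pipeline runs and accepts the chat-style message format (a throwaway test of my own, not part of the workflow proper):

test_messages = [{"role": "user", "content": "Say hello in one short sentence."}] # dummy prompt
out = pipe(test_messages, **generation_args, max_new_tokens=20)
print(out[0]["generated_text"]) # should print a short greeting

Since return_full_text is False, the pipeline returns only the newly generated text rather than echoing the prompt back.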

Now we want a function that will generate our prompts for us. This is in case I want to generate multiple blog posts. We use f-strings to specify some values that might change.

def get_blog_prompt(points, consider, length):
    prompt = [
                {"role": "system", "content": f"You are an assistant that generates detailed blog posts based on user input. You provide comprehensive explanations and elaborate on topics with rich detail. Consider the following information from previous research while generating the content of the post: \n{consider}"},
                {"role": "user", "content": f"The following sentences are section headings for a {length} word blog post. Flesh out the ideas for each heading, generating two or more paragraphs that discuss the content of each heading in relation to information from previous research. \n{points}"},
              ]
    return prompt
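To make the structure concrete, here is what a call looks like with placeholder arguments (the strings below are dummies, not the real inputs):

example = get_blog_prompt("Title: ...\nSummary of previous post", "Background text goes here.", 2000)
print(example[0]["role"]) # system
print(example[1]["content"][:100]) # start of the user instruction

The system message carries the background material to consider, while the user message carries the outline and the word target.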

At this point we can get the text files from a folder, read them, and store the resulting strings in tuples. This gives us context to feed to the language model to help it follow our instructions. One issue here is that if the input contains too many tokens (roughly, words plus punctuation), we will run out of memory. Each LLM also has a maximum prompt length (its context window), and while these can be extremely long on paper, if you have GPU memory constraints it can be tricky to figure out a sweet spot in terms of how many documents/tokens you can actually pass to the model. In my case, I found that limiting the input to roughly 15-20k words allowed the Phi-3 model to run fine on my hardware.

blogfile = "Some_notes_Part_2.txt" # this is the doc with the bulleted outline
infofiles = [x for x in glob.glob("previousposts/*.txt")] # these are the docs I want the model to consider, converted to txt

with open(blogfile) as f:
    blogsum = f.read()

infostring = "" # blank string
for inf in infofiles:
    with open(inf) as f:
        infostring += "Title: "+inf.split("/")[-1][:-4]+"\n\n" # assume that the filename is the title
        infostring += f.read()+"\n\n" # read in the contents of the doc and add it to our string

print(len(infostring.split())) # rough length in words; the true token count will be higher (punctuation, subwords)

bloglist = [(blogfile, blogsum, infostring)]
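Since the word count above is only a rough proxy, we can also run the combined string through the tokenizer we loaded earlier to get an exact count and truncate if needed. A minimal sketch (the 25,000-token budget is a placeholder, not something I tuned):

token_ids = tokenizer(infostring)["input_ids"] # exact tokenization of the background text
print(len(token_ids)) # true token count, including punctuation and subwords
token_budget = 25000 # placeholder; adjust to your GPU memory
if len(token_ids) > token_budget:
    infostring = tokenizer.decode(token_ids[:token_budget], skip_special_tokens=True)
    bloglist = [(blogfile, blogsum, infostring)] # rebuild the tuple with the truncated text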

Here is where the magic happens. We go through the list of tuples and turn each one into a prompt. Then we convert our prompts into a Dataset, which supports batching, and pass each prompt to the generator, which feeds it to the model and gets back the relevant text. This takes about 1-2 minutes on my 24 GB GPU. Then we can write the output to a file.

wordnum = 2000 # the number of words to aim for in a post

for points in bloglist:
    # store the name of the post
    name = points[0]
    # get the prompt for each post
    ndata = [get_blog_prompt(points[1], points[2], wordnum)]
    ntexts = {"prompts": ndata} # convert to dict
    dset = Dataset.from_dict(ntexts) # convert to dataset
    data_iterator = KeyDataset(dset, "prompts") # convert to iterable
    print("Generating post...")
    # Record the start time
    start_time = time.process_time()
    # generate the post using the pipeline and arguments
    generator = pipe(data_iterator, **generation_args, max_new_tokens=wordnum, batch_size=4) # note: max_new_tokens counts tokens, not words, so output may fall short of the word target
    nsents = []
    for out in generator:
        result = out[0]['generated_text'] # get the output
        nsents.append(result) # add to our results list
        print(result) # view the output
        print(len(result.split())) # rough count of words
        # Record the end time
        end_time = time.process_time()
        # Calculate the elapsed time
        elapsed_time = end_time - start_time
        print(f"Process time: {elapsed_time} seconds")

    for num, x in enumerate(nsents):
        # write to a file
        with open(name[:-4]+"_output"+str(num+1)+".txt", "w") as f:
            f.write(x)
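One tweak worth considering: because the cap above is in tokens, a rough words-to-tokens conversion (the 1.4 ratio is a ballpark assumption, not something I measured) would let the post actually reach the word target:

token_target = int(wordnum * 1.4) # ballpark tokens-per-word ratio; varies by tokenizer
generator = pipe(data_iterator, **generation_args, max_new_tokens=token_target, batch_size=4)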

The result is actually surprisingly decent. Obviously your mileage will vary depending on your input and the model you’re using. To be clear, I gave the headings to the model as separate sentences/points (via the file “Some_notes_Part_2.txt”), and the rest of the text was generated by the Phi-3 model. Here are a few of the paragraphs it produced:

Title: Some notes on LLMs in real-world contexts (Part 2)

Summary of previous post:

In the first part of this series, we explored the use of large language models (LLMs) like GPT-3 and BERT for text-based applications, such as generating descriptions from keywords and classifying implicit motives in text. We discussed the process of training LLMs for these tasks, including dataset selection, model fine-tuning, and performance evaluation. We also examined the challenges and benefits of using LLMs for these applications, as well as the success of recent models in comparison to previous ones.

Description of using large language models to automate Implicit Motive Classification:

In this section, we delve into the use of LLMs for classifying implicit motives in text, a task that has been traditionally performed by human experts. We discuss the process of training LLMs to classify implicit motives, including the selection of appropriate models, fine-tuning on relevant datasets, and the evaluation of model performance. We also highlight the recent success of our implicit motives classifier, which outperforms previous models on various metrics.

Challenges of this approach:

Despite the recent success of LLMs in classifying implicit motives, there are still significant challenges in using them for this task. One of the main challenges is the brittleness of LLMs, as they struggle to generalize to unseen data and may produce erroneous classifications. Additionally, LLMs may not capture the nuanced understanding of implicit motives that human experts possess, as they are limited to the information present in the text.

Benefits of this approach:

The use of LLMs for classifying implicit motives offers several benefits over traditional human-based approaches. Firstly, LLMs can process large amounts of text data quickly and accurately, making them well-suited for classifying implicit motives on a large scale. Secondly, LLMs can capture subtle semantic representations of text, allowing for the extraction of nuanced cues that correlate with implicit motives. This can lead to more accurate and reliable classifications.

Success of current model vs previous models:

Our current implicit motives classifier, implemented using the Electra-based model, has demonstrated superior performance compared to previous models. It has achieved high levels of accuracy and generalization, outperforming previous models on various metrics. This success can be attributed to the advancements in LLM technology and the availability of large-scale training data.

Not too bad - some minor issues that could be easily edited out. Now here’s the concluding paragraph, and you can probably already spot some larger issues with the text. At this point (i.e. toward the end), the model seems to be caught in a bit of a loop, and is repeating itself a lot.

The recent success of LLMs in classifying implicit motives has demonstrated the potential of LLMs to automate complex tasks like motive classification. The advancements in LLM technology have enabled them to capture more nuanced semantic representations of text, leading to more accurate and reliable classifications. This has significant implications for the usefulness of LLMs in various real-world applications, including motive classification. The ability of LLMs to capture subtle semantic representations of text allows for more accurate and reliable classifications, making them well-suited for complex tasks like motive classification. The recent success of LLMs in this area highlights the potential of LLMs to automate complex tasks like motive classification, paving the way for future advancements in this field.
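If you hit the same kind of looping, one low-effort mitigation (untried here, so take it as a sketch rather than a recommendation) is to move away from pure greedy decoding, which is prone to exactly this sort of repetition, by adding a repetition penalty and/or light sampling to the generation arguments:

generation_args = {
    "return_full_text": False,
    "do_sample": True, # sample instead of always taking the single most likely token
    "temperature": 0.7, # placeholder value; tune to taste
    "repetition_penalty": 1.15, # penalizes tokens the model has already produced
    "use_cache": True,
    }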

Concluding thoughts

There is definite promise here in terms of automating blog content, and I can see how attractive such possibilities are to many people. A process like the one demonstrated here moves a lot of the creativity from the production phase to the planning phase of content generation. This is a different process than many writers follow (some are planners, some plan as they go, some just go, others do some combination), and while it may be more productive, that doesn’t mean it gives a better result. The blog post that my AI “agent” produced was readable, though not particularly interesting, and it wasn’t fully accurate. But it does give me a starting point from which to work.

The bigger issue with this kind of tool is that it can constrain your direction. Simply by generating this blog post, I have “locked in” a series of ideas regarding how I discuss implicit motive classification. I am less likely to edit the voice/style of statements I agree with, and I may take the easy route of using this language rather than trying to say things in my own words. This contributes to the “ensloppification” of the internet, and is extremely problematic at scale.

There are additional things that can be done to mitigate such pressures and improve the result. But given that I wrote this post in about 2 hours without the aid of AI (not including the coding/testing time) and then spent a couple more editing it, I wonder whether it’s worth spending the time to finesse the blog generator tool. I’ll have some more thoughts on this in another post, but hopefully this tutorial has been helpful for you. Do get in touch with any questions/comments.