# Gamebooks and graph theory

Posted on 27 October 2019 in data-science

A game book is, contrary to the usual books, a book you don't read pages sequentially. These books are read interactively. You are offered a choice after a paragraph: go to the right turn to section 7, go to the left turn to 138. That's it. Depending on the series, you might have additional rules: fight, magic, psi power, etc.

*An example of a graph*

During the winter holidays, I thought a bit more about these books. They could be encoded as directed graph networks. Therefore I could probably apply a bunch of network algorithms to them to extract interesting information such as:

- the shortest path to an instant death,
- the path with the most fights,
- etc.

I chose for my analysis the Lone Wolf series because, luckily for me, some people have put all the books in an electronic format and it's legal. It's a story about Lone Wolf (unexpected I know) the last of his kind, a caste of warrior monks.

NB: *The Dawn of the Darklords* was excluded from the analysis as it was not officially released as a gamebook. It was included in the Magnamund companion.

## TL;DR

- The
*Masters of Darkness*has the most action packed with a possible solution path including**65**fights; - The shortest path to death is
*The Kingdoms of Terror*with only a**5**section path; - The
*Caverns of Kalte*is the most deadly adventure with**19**instant death sections; - The shortest adventure is
*Flight from the Dark*with a solution path only**27**sections long; - The longest adventure which can be done is
*The Shadow on the Sand*with a touristic path of**224**sections, more than half of the sections (400).

## Summary

Here is a summary for the 28 books I've analyzed. For now Lonewolf comprises 4 series:

- Kai series: 1 to 5
- Magnakai series: 6 to 12
- Grand Master series: 13 to 20
- New Order series: 21 to 32 (but now only 28 in project aon)

The values reported below are the average value for each category. Something interesting we can see if that from the 3rd series, there are no more cycles and the shortest path has increased on average 50% compared to the 1st and 2nd series. Also, the shortest path to death has tripled and the number of insta-death was halved.

Over time, the books might have become more focused on adventure and story but also less punishing. Having only read the first couple of books, I can't comment on this but if somebody has an opinion on this, I would be happy to hear about it and maybe update the post with your comments.

series | shortest path | shortest path to death | path with the most fights | # of fight | # of luck | # of death | # of cycles | longest path |
---|---|---|---|---|---|---|---|---|

1-kai | 51 | 12 | 20 | 43 | 25 | 11 | 11 | 153 |

2-magnakai | 66 | 10 | 21 | 42 | 23 | 15 | 5 | 171 |

3-grand-master | 95 | 37 | 18 | 40 | 37 | 7 | 0 | 163 |

4-new-order | 97 | 30 | 12 | 27 | 43 | 6 | 0 | 156 |

# The technical bits

## The preparation: Turn to 1

"Turn to" are the mythic words in these game books. It's also how we will divide the different sections of text, by using regexp. There are 5 types of section and their assigned color:

**normal**: you can move forward to another section (white),**luck**: you are asked to test your luck and you can move forward to another section (green),**fight**: you are asked to fight some monster(s) and you can move forward to another section (yellow),**death**: you chose badly and you got yourself killed, you have to restart from the section 1 (red),**start/end**: first section and last section (blue).

*Dawn of the darklords graph*

Once the sections are defined, we have to create the directed graph. To do so, I used two python libraries:

## Extracting the interesting information: Test your Luck

I mainly used `networkx`

for the graph network analysis. It's a straightforward library and the documentation is good.

### Do you need a DAGger?

Typically a Lonewolf adventure is the equivalent of a Directed (A)Cyclic Graph:

**Directed**: Lonewolf, your character, goes from the section 1 to hopefully the latest section which is depending on the book the section 300, 350 or 400, without the possibility to come back to the previous section;**Acyclic**: This is not totally true for the Lonewolf series as 7 books contain cycles, a "circular" path between two nodes which a node is repeated twice. Some algorithms like the shortest path or longest path require a DAG and we need to remove the cycles before running them.**Graph**: The sections are the nodes of the graph and the vertices the choices for each section.

### Disconnected graphs

In several books, graphs are disconnected. It means you can't go from the section 1 to the end section (300 or 350). This indicates usually that there is an enigma or puzzle asking to add numbers discovered along the adventure and reach the section given by the number. The only way to process these graphs is to check the text notes and add the missing edges manually.

### Cycle removal

The cycle removal is an interesting problem as it is one of the first problem to have been shown as NP-complete (NP stands for Non deterministic Polynomial time). This means that there is no known way to find a solution to solve that problem quickly and the time to find a solution grows as the size of the input grows. Nonetheless we are lucky because the data from a gamebook is usually quite small (300 to 400 nodes and 400 to 600 edges)!

The idea behind the cycle removal is simple:

- do a DFS search,
- look at the nodes and their children,
- if one or more children have been already visited, remove that edge.

# Conclusion

It was a cool project to do. I brushed up on the graph theory which I never really used outside of university. I would like to see if there are any correlations between the features I chose and the popularity of the gamebooks.