table-of-bot-metrics.md: Escape robot names for Markdown table

Some characters which could occur in a crawler's name have a special meaning in
Markdown. They are escaped to prevent them from having unintended side effects.

The escaping is applied only to the first (Name) column of the table. The
remaining columns are expected to be Markdown-encoded already in robots.json.
Dennis Camera 2025-02-18 10:12:04 +01:00
parent a884a2afb9
commit 0bd3fa63b8
2 changed files with 26 additions and 22 deletions

@@ -121,13 +121,17 @@ def json_to_txt(robots_json):
     return robots_txt
+def escape_md(s):
+    return re.sub(r"([]*\\|`(){}<>#+-.!_[])", r"\\\1", s)
 def json_to_table(robots_json):
     """Compose a markdown table with the information in robots.json"""
     table = "| Name | Operator | Respects `robots.txt` | Data use | Visit regularity | Description |\n"
-    table += "|-----|----------|-----------------------|----------|------------------|-------------|\n"
+    table += "|------|----------|-----------------------|----------|------------------|-------------|\n"
     for name, robot in robots_json.items():
-        table += f'| {name} | {robot["operator"]} | {robot["respect"]} | {robot["function"]} | {robot["frequency"]} | {robot["description"]} |\n'
+        table += f'| {escape_md(name)} | {robot["operator"]} | {robot["respect"]} | {robot["function"]} | {robot["frequency"]} | {robot["description"]} |\n'
     return table
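
The escaping described in the commit message can be tried in isolation with the minimal sketch below. The ORIGINAL pattern is copied verbatim from the diff. The `pattern` parameter, the VARIANT pattern and the sample names are assumptions made for illustration and are not part of the commit. One detail worth noting: inside a character class, `+-.` is read by Python's `re` module as a range from `+` to `.`, so the original pattern also matches a comma. The variant escapes the hyphen to show the difference.

```python
import re

# Pattern copied verbatim from the diff above.
ORIGINAL = r"([]*\\|`(){}<>#+-.!_[])"

# Hypothetical variant (not in the commit): the hyphen is written as "\-",
# so "+", "-" and "." are three literals instead of the range "+" to ".".
VARIANT = r"([]*\\|`(){}<>#+\-.!_[])"


def escape_md(s, pattern=ORIGINAL):
    """Prefix every character matched by `pattern` with a backslash."""
    return re.sub(pattern, r"\\\1", s)


if __name__ == "__main__":
    # Underscore and asterisk would otherwise start emphasis in Markdown.
    print(escape_md("My_Bot*"))          # My\_Bot\*
    # The comma falls inside the "+-." range of the original pattern.
    print(escape_md("a,b"))              # a\,b
    # With the hyphen escaped in the class, the comma is left alone.
    print(escape_md("a,b", VARIANT))     # a,b
```

An escaped comma still renders as a plain comma in Markdown, so the range is harmless for the generated table. It only matters if the escaped names are consumed by something other than a Markdown renderer.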