2 Comments

Could you put your scripts on GitHub somewhere?

Uh it's a bit messy. This is the etymology-checking function:

```
import re

import requests

# define the API endpoint and base parameters
endpoint = "https://en.wiktionary.org/w/api.php"
params = {
    "action": "query",
    "format": "json",
    "prop": "extracts",
    "titles": "",
}

def wiktionary_is_germanic(word):
    # NOTE: despite the name, this returns True when the page text mentions
    # French, Latin, or Greek, and also when no extract is found.
    # copy the base parameters so the module-level dict is not mutated
    query = dict(params, titles=word)
    # send a GET request to the API and parse the JSON response
    response = requests.get(endpoint, params=query)
    data = response.json()
    # pull the page extract out of the response and strip HTML tags
    try:
        pages = data["query"]["pages"]
        page_id = list(pages.keys())[0]
        extract = re.sub('<[^<]+?>', '', pages[page_id]['extract'])
    except KeyError:
        # no extract (e.g. missing page): print the word for inspection
        print(word)
        return True
    # substring match anywhere in the returned extract
    return any(lang in extract for lang in ['French', 'Latin', 'Greek'])
```
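For anyone wanting to run a word-level check like this over a whole text, here is a minimal sketch of one way it could be done. It is not taken from the original scripts: `flagged_ratio` and `toy_checker` are hypothetical names, the tokenizer is a crude assumption, and the checker is passed in as an argument so the example runs offline without calling Wiktionary.

```python
import re
from typing import Callable


def flagged_ratio(text: str, is_flagged: Callable[[str], bool]) -> float:
    """Return the share of distinct words in `text` for which `is_flagged` is True."""
    # Lowercase and keep runs of ASCII letters only (a crude, assumed tokenizer).
    words = set(re.findall(r"[a-z]+", text.lower()))
    if not words:
        return 0.0
    # Each distinct word is checked once, which keeps the number of lookups down.
    flagged = sum(1 for w in words if is_flagged(w))
    return flagged / len(words)


def toy_checker(word: str) -> bool:
    # Hypothetical offline stand-in for a real per-word lookup.
    return word in {"liberty", "justice"}


ratio = flagged_ratio("Liberty and justice for all", toy_checker)
```

In practice the stand-in would be swapped for the real lookup function, ideally with results cached so each word is only requested once.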

I'm not sure which of the folders of essay texts I can post publicly. I'd like to find a large sample of newspaper articles or something similar, to compare e.g. Orwell with non-famous writers.
