400 Error when trying to use jsonlines file with OpenAI answers - openai-api

I uploaded a file to openai for the purpose of answers. I can't figure out how to access it without getting an error.
This is the contents of my file:
{"text":"We are focusing on improving communication to the customers."}
{"text":"The emphasis on the customer is working well."}
{"text":"Services initiatives"}
{"text":"CMM and focus on customer service."}
{"text":"We are working on CMM to focus on the customer."}
{"text":"Our emphasis on making our customers number one."}
{"text":"Customer relationships are continuing to improve."}
This is the information about the file when I query openai:
{"object":"file","id":"file-02fdY5PuZ1aO4cnd2r1PxaB7","purpose":"answers","filename":"f3f1c105-5217-4132-8950-a040b6183ed7","bytes":458,"created_at":1648396701,"status":"processed","status_details":null}
Now I ask a question at https://api.openai.com/v1/answers with this POST:
{"file":"file-02fdY5PuZ1aO4cnd2r1PxaB7","question":"What is the main theme?","search_model": "davinci","model": "davinci","examples_context": "In 2017, U.S. life expectancy was 78.6 years.","examples": [["What is human life expectancy in the United States?","The life expectancy in the United States is 78 years."]],"max_tokens": 100,"stop": ["\n", "<|endoftext|>"]}
The server returns a 400 error. If in my request I replace the "file" field with a "documents" array using same data, the request is successful.
I don't know what more to look at. If someone can suggest something or if you seen an issue in what I am trying to do, I would deeply appreciate hearing from you.
Thanks

Related

Fine Tuning GPT-3 for Consistent Output Format

I am trying to use Open AI API to create quiz questions with three incorrect answers and one correct answer. The prompt I use is
`Write a quiz on ${quiz_topic} with 5 questions. Each question has 3 incorrect answers and 1 correct answer. The correct answer is always the last answer. Write each answer on a separate line`
I would like to parse ChatGPT's answer but the format is inconsistent. For instance, is quiz_topic is "literature", I get the following response:
Q1. Who wrote the novel "The Catcher in the Rye"?
A. J.D. Salinger
B. Ernest Hemingway
C. F. Scott Fitzgerald
D. Mark Twain
This is the correct format. But if I use "U.S History", I get the following output:
Q1: Who was the first president of the United States?
A1: George Washington Jr., Thomas Jefferson, John Adams, George Washington
This is a different format. So my goal is to fine-tune it using the article here
I have created 200 line of a training set. Here is a sample:
{"prompt":"Write a quiz on general knowledge with 5 questions. Each question has 3 incorrect answers and 1 correct answer. The correct answer is always the last answer.Write each answer on a separate line \n\n---\n\n","completion":" Q. What is the Italian word for "tomato"?\nA. Aglio\nA. Cipolla\nA. Peperoncino\nA. Pomodoro\n\nQ. Which restaurant's mascot is a clown?\nA. Whataburger\nA. Burger King\nA. Sonic\nA. McDonald's\n\nQ. Which of these colours is NOT featured in the logo for Google?\nA. Yellow\nA. Blue\nA. Green\nA. Pink\n\nQ. In 2013 how much money was lost by Nigerian scams?\nA. $95 Million\nA. $956 Million\nA. $2.7 Billion\nA. $12.7 Billion\n\nQ. What is the name of Poland in Polish?\nA. Pupcia\nA. Polszka\nA. PĆ³land\nA. Polska\n\n \n\n###\n\n"}
{"prompt":"Write a quiz on books with 5 questions. Each question has 3 incorrect answers and 1 correct answer. The correct answer is always the last answer.Write each answer on a separate line \n\n---\n\n","completion":" Q. What is the name of Eragon's dragon in "Eragon"?\nA. Glaedr\nA. Thorn\nA. Arya\nA. Saphira\n\nQ. In the "The Hobbit", who kills Smaug?\nA. Bilbo Baggins\nA. Gandalf the Grey\nA. Frodo\nA. Bard\n\nQ. What is Hermione Granger's middle name?\nA. Jane\nA. Emma\nA. Jo\nA. Jean\n\nQ. According to The Hitchhiker's Guide to the Galaxy book, the answer to life, the universe and everything else is...\nA. Loving everyone around you\nA. Chocolate\nA. Death\nA. 42\n\nQ. What is the name of the three headed dog in Harry Potter and the Sorcerer's Stone?\nA. Spike\nA. Poofy\nA. Spot\nA. Fluffy\n\n \n\n###\n\n"}
When I run the validation tool with the command
openai tools fine_tunes.prepare_data -f training.jsonl
I get the following message
- All prompts start with prefix `Write a quiz on `. Fine-tuning doesn't require the instruction specifying the task, or a few-shot example scenario. Most of the time you should only add the input data into the prompt, and the desired output into the completion
I don't understand why I must remove "Write a quiz on". So I have misunderstood how to fine-tune a model for consistent formatting.
Can anybody shed a light on how to make sure I get the same formatting with the same prompt

I've been working on a discord bot for at least a week and just today, my joke command started saying invalid syntax

I have been working on a discord.py bot when yesterday, on my bus, some of my code got deleted. Everything was working fine, but now my joke command doesn't work. I am pretty new to coding as this is my 2nd week, but can someone explain to me why this says invalid syntax?
#bot.command()
async def joke(ctx):
responses = [
"Two hunters are out in the woods when one of them collapses. He's not breathing so his friend calls 911. My friend is dead! What should I do? The operator replies, Calm down sir, first make sure that he's really dead. There's a silence, then a loud bang. Back on the phone, the guy says, Ok, now what?",
"I threw a boomerang a few years ago; now I live in constant fear.",
"Someone stole my mood ring. I don't know how I feel about that.",
"Women call me ugly until they find out how much money I make. Then they call me ugly and poor.",
"You're not completely useless: you can serve as a bad example.",
"I broke my finger last week. On the other hand, I'm okay.",
"Appearantly someone in London gets stabbed every 52 seconds. Poor bastard.",
"A Roman legionaire walks into a bar, holds up two fingers, and says, Five Beers please!"
await ctx.send(f'{random.choice(responses)}')
Your missing a closing bracket at the end of your list.

Searching a lot of keywords on twitter via tweepy

I am trying to make a python code with tweepy that will track all the tweets from a specific country from a date which will have some of the chosen specific keywords. And I have chosen a lot of keywords like 24-25.
My keywords are vigilance anticipation interesting ecstacy joy serenity admiration trust acceptance terror fear apprehensive amazement surprize distraction grief sadness pensiveness loathing disgust boredom rage anger annoyance.
for more understanding, my code till now is:
places = api.geo_search(query="Canada",granularity="country")
place_id = places[0].id
public_tweets = tweepy.Cursor(api.search,
q="place:"+place_id+" since:2020-03-01",
lang="en",
).items(num_tweets)
Please help me with this question as soon as possible.
Thank You

Searching closest venues using ll and radius not working properly

I know there are a lot of questions about this issue, but I've reached a point where I can't really do anything else but to ask if somebody else has a solution for this issue...
Using the Foursquare api explorer to test out my query I can't seem to obtain an accurate or even good fit for the data I need to obtain.
It is quite simple. I need to obtain the closest venue from a set of coordinates. I don't mind not having results if nothing is found near by.
So, reading the API documentation (https://developer.foursquare.com/docs/venues/venues) I conclude that I need a search and not an explore because I don't want sugestions of recommended venues (and the results when I tested it proved that it wasn't what I was expecting).
So, using search api I want to find places (the place, but places would do...) close to these coordinates
ll=37.424782,-122.162989
considering that I want places close by, I add
radius=51
and I don't really want many results
limit=2
from the documentation I see that radius is
Only valid for requests with intent=browse, or requests with
intent=checkin and categoryId or query
so, I use
intent=browse
which concludes my query to:
venues/search?intent=browse&ll=37.424782,-122.162989&radius=51&limit=2
Query Result:
https://developer.foursquare.com/docs/explore#req=venues/search%3Fintent%3Dbrowse%26ll%3D37.424782,-122.162989%26radius%3D51%26limit%3D2
Here we can see that the first result is straight outside of the radius ... distance: 135
the second result however is cool ... distance: 50
What am I doing wrong to get these results? If I increase the limit all I get is more results that are also outside the radius, I could iterate through them and find the one with the smallest distance... but I have no guarantee that the closest result will be on the top X that I limit, even If I had that guarantee, it would be a tiresome solution to an apparently simple question...
Thanks for the help...
Marc
EDIT:
I managed to make have the query perform as I intended ... But I had to add all of the parent categories from:
https://developer.foursquare.com/categorytree
categoryId=
4d4b7104d754a06370d81259, Arts & Entertainment
4d4b7105d754a06372d81259, College & University
4d4b7105d754a06373d81259, Event
4d4b7105d754a06374d81259, Food
4d4b7105d754a06376d81259, Nightlife Spot
4d4b7105d754a06377d81259, Outdoors & Recreation
4d4b7105d754a06375d81259, Professional & Other Places
4e67e38e036454776db1fb3a, Residence
4d4b7105d754a06378d81259, Shop & Service
4d4b7105d754a06379d81259 Travel & Transport
making my query into:
venues/search?
intent=checkin&ll=37.424782,-122.162989&radius=60&categoryId=4d4b7104d754a06370d81259,4d4b7105d754a06372d81259,4d4b7105d754a06373d81259,4d4b7105d754a06374d81259,4d4b7105d754a06376d81259,4d4b7105d754a06377d81259,4d4b7105d754a06375d81259,4e67e38e036454776db1fb3a,4d4b7105d754a06378d81259,4d4b7105d754a06379d81259
Query Result:
https://developer.foursquare.com/docs/explore#req=venues/search%3Fintent%3Dcheckin%26ll%3D37.424782,-122.162989%26radius%3D60%26categoryId%3D4d4b7104d754a06370d81259,4d4b7105d754a06372d81259,4d4b7105d754a06373d81259,4d4b7105d754a06374d81259,4d4b7105d754a06376d81259,4d4b7105d754a06377d81259,4d4b7105d754a06375d81259,4e67e38e036454776db1fb3a,4d4b7105d754a06378d81259,4d4b7105d754a06379d81259
It still has results outside of my radius still ... but it's an acceptable error margin ... it is weird however.
I managed to make have the query perform as I intended ... But I had to add all of the parent categories from:
https://developer.foursquare.com/categorytree
categoryId=
4d4b7104d754a06370d81259, Arts & Entertainment
4d4b7105d754a06372d81259, College & University
4d4b7105d754a06373d81259, Event
4d4b7105d754a06374d81259, Food
4d4b7105d754a06376d81259, Nightlife Spot
4d4b7105d754a06377d81259, Outdoors & Recreation
4d4b7105d754a06375d81259, Professional & Other Places
4e67e38e036454776db1fb3a, Residence
4d4b7105d754a06378d81259, Shop & Service
4d4b7105d754a06379d81259 Travel & Transport
making my query into: venues/search?
intent=checkin&ll=37.424782,-122.162989&radius=60&categoryId=4d4b7104d754a06370d81259,4d4b7105d754a06372d81259,4d4b7105d754a06373d81259,4d4b7105d754a06374d81259,4d4b7105d754a06376d81259,4d4b7105d754a06377d81259,4d4b7105d754a06375d81259,4e67e38e036454776db1fb3a,4d4b7105d754a06378d81259,4d4b7105d754a06379d81259
Query Result: https://developer.foursquare.com/docs/explore#req=venues/search%3Fintent%3Dcheckin%26ll%3D37.424782,-122.162989%26radius%3D60%26categoryId%3D4d4b7104d754a06370d81259,4d4b7105d754a06372d81259,4d4b7105d754a06373d81259,4d4b7105d754a06374d81259,4d4b7105d754a06376d81259,4d4b7105d754a06377d81259,4d4b7105d754a06375d81259,4e67e38e036454776db1fb3a,4d4b7105d754a06378d81259,4d4b7105d754a06379d81259
It still has results outside of my radius still ... but it's an acceptable error margin ... it is weird however.
Although this question is old I'm responding for others. I was working on something similar to this recently and what I learned was that in order to use radius you also need to use the 'query' parameter. What I did was to use the star character '*' and it worked for me. I have to say though that the limit of 50 is something I haven't solved yet which I'm working on at the moment.

Using Google Geocoding with Python

I was just curious if anyone has had some success installing and importing the Geocoding module into Python. I believe I have successfully done this, but then the results of my queries are atypical.
For instance, whenever I enter an address with just a street name and number, it gives me a Query Error like this:
geopy.geocoders.googlev3.GQueryError: The geocode was successful but returned no results. This may occur if the geocode was passed a non-existent address or a latlng in a remote location.
However, if I include a city and/or state in the address, the query will run and return the place as only the city and state, and will basically disregard the street number and street name and will give me the lat and lng of the center of that city.
Example:
place,(lat,lng)= g.geocode("4224 Evans to Locks Road Augusta Georgia")
I just was wondering if there was perhaps a problem in my installation or something that is preventing the geocode from working. I honestly have no idea how this kind of problem occurs and am very new to Geocoding. I have moderate experience working with Python however. Any help is appreciated.
I'm not too familiar with Geopy, however if you use Geocoder on GitHub/PyPi you should be able to accomplish everything you need, lots of documentation to help you get started.
Here's how you would use it following your example:
>>> import geocoder
>>> g = geocoder.google("4224 Evans to Locks Road Augusta Georgia")
>>> g.latlng
(33.5411157, -82.11359039999999)
>>> g.json
...
How to install:
pip install geocoder

Resources