
A code training data set made from the Rosetta Code project.


TheOneTrueGuy/RosettaDataSet


A repo for converting the Rosetta Code problem space into machine learning training data. Rosetta Code is a programming chrestomathy: hundreds of tasks and algorithm types, with solutions contributed across several hundred programming languages.

More data can be found here: https://github.com/ingydotnet/rosettacode-pm and here: https://rosettacode.org/wiki/Rosetta_Code

You will have to git clone https://github.com/acmeism/RosettaCodeData to get the data that the rcds*.py scripts need in order to run. So far none of them have worked perfectly. I've only put about an hour into this, with the help of ChatGPT, Claude, Copilot and Gemini. I cycle through them, giving each a chance. When one gets hung up, I ask it to restate the objective to improve the next attempt, and I pass that restatement on to the next one. objective2.txt is the current version of the generated objective; it is actually a running concatenation of those restatements.
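For anyone who wants to see the general shape of the conversion, here is a minimal sketch. It is not any of the rcds*.py scripts. It assumes the cloned RosettaCodeData checkout is laid out as Task/<task name>/<language>/<source file>, and the record fields (task, language, file, code) and the function names collect_records and write_json are made up for illustration.

```python
import json
from pathlib import Path


def collect_records(repo_root):
    """Return one record per source file found under <repo_root>/Task.

    Assumed layout (not verified here): Task/<task>/<language>/<file>.
    Anything that does not match that shape is skipped.
    """
    records = []
    task_root = Path(repo_root) / "Task"
    for task_dir in sorted(p for p in task_root.iterdir() if p.is_dir()):
        for lang_dir in sorted(p for p in task_dir.iterdir() if p.is_dir()):
            for src in sorted(p for p in lang_dir.iterdir() if p.is_file()):
                # errors="replace" keeps the walk going on files that are
                # not valid UTF-8 instead of raising UnicodeDecodeError.
                code = src.read_text(encoding="utf-8", errors="replace")
                records.append({
                    "task": task_dir.name,       # hypothetical field name
                    "language": lang_dir.name,   # hypothetical field name
                    "file": src.name,            # hypothetical field name
                    "code": code,                # hypothetical field name
                })
    return records


def write_json(records, out_path):
    """Write the records as one JSON array, keeping non-ASCII text readable."""
    with open(out_path, "w", encoding="utf-8") as fh:
        json.dump(records, fh, ensure_ascii=False, indent=2)
```

Calling write_json(collect_records("RosettaCodeData"), "dataset.json") from the directory that contains the clone would produce a single JSON file. Both of those paths are placeholders.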

News! rcds10.py seems to work, but I probably still have formatting issues due to the mixed formats of the various languages and the web display preparation necessary for each type of answer. Please, if anyone tries to use the .json file, let me know how it goes.
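If you do try the .json file, a quick sanity check like the one below might help narrow down formatting problems before training on it. This is only a sketch. It assumes the file is a JSON array of objects, and the field names task, language and code are guesses, not the actual schema that rcds10.py writes. Adjust the required tuple to match what you find. The summarize function is hypothetical too.

```python
import json
from collections import Counter


def summarize(json_path, required=("task", "language", "code")):
    """Load the data set and report record counts and malformed entries.

    The `required` field names are assumptions about the schema.
    """
    with open(json_path, encoding="utf-8") as fh:
        data = json.load(fh)
    if not isinstance(data, list):
        raise ValueError("expected a top-level JSON array")

    bad_indexes = []
    per_language = Counter()
    for i, rec in enumerate(data):
        # A record is usable only if it is an object and every required
        # field is a non-empty string.
        ok = isinstance(rec, dict) and all(
            isinstance(rec.get(key), str) and rec[key] for key in required
        )
        if ok:
            per_language[rec["language"]] += 1 if "language" in required else 0
        else:
            bad_indexes.append(i)

    return {
        "total": len(data),
        "bad_indexes": bad_indexes,
        "per_language": dict(per_language),
    }
```

The bad_indexes list points at the records worth opening by hand, which is where mixed formats or leftover web markup are most likely to show up.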
