DCGAN---LOGOS

Programs:

cgan.py:
• The main program to run. It builds a DCGAN model, reads the embeddings, labels and images into memory, and feeds them to the DCGAN.
• Images are processed to be 64*64*3. Pixel values are normalized to [-1,1].
• Labels are text embeddings of shape 1*100 (float32).
• Noise z consists of randomly generated numbers in [-1,1].
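The input preparation described above can be sketched roughly as follows. This is a minimal illustration rather than the code in cgan.py: the function names, the noise dimension `z_dim=100` and the choice of a uniform distribution are assumptions, since the README only states the value ranges.

```python
import numpy as np

def normalize_image(img_uint8):
    """Map uint8 pixel values in [0, 255] to float32 values in [-1, 1]."""
    return img_uint8.astype(np.float32) / 127.5 - 1.0

def sample_noise(batch_size, z_dim=100, seed=None):
    """Draw noise vectors from [-1, 1].

    Assumption: a uniform distribution and z_dim=100; the README only
    says the values lie between -1 and 1.
    """
    rng = np.random.default_rng(seed)
    return rng.uniform(-1.0, 1.0, size=(batch_size, z_dim)).astype(np.float32)

# Dummy 64*64*3 image standing in for a resized logo.
img = np.full((64, 64, 3), 255, dtype=np.uint8)
x = normalize_image(img)
z = sample_noise(batch_size=4, seed=0)
```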

ops.py:
• Contains helper functions used by cgan.py.

extract.py:
• Performs a kNN search around the given base images, then extracts the nearest neighbours found, resizes those images and puts them in a folder (called selected).
• Stores the final app ids and corresponding text embeddings in final_nearest_combined_v2.p as a Python dictionary. Note that corresponding text embedding
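As a rough illustration of the kNN step, the sketch below finds the nearest rows to a base vector. The README does not say which features or distance metric extract.py uses, so Euclidean distance over generic feature vectors is an assumption, and `k_nearest` is a hypothetical helper name.

```python
import numpy as np

def k_nearest(base_vec, candidates, k=5):
    """Return indices of the k rows of `candidates` closest to `base_vec`.

    Assumption: Euclidean distance; the actual metric may differ.
    """
    dists = np.linalg.norm(candidates - base_vec, axis=1)
    return np.argsort(dists)[:k]

# Toy 2-D features; real features would be much wider.
candidates = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [0.1, 0.0]])
nearest = k_nearest(np.array([0.0, 0.0]), candidates, k=2)
```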

Relevant files:

mapping.csv: stores the mapping of every app to its group number (the file number of the text embeddings, which has 1002 files).

picked_embed_v2.p: stores the base app ids and text embeddings as a Python dictionary. Keys are app ids and values are the corresponding text embeddings. App ids are encoded with the group number. For example, for app com.ea.games.r3_row of group 506, the app id is encoded as com.ea.games.r3_row*506. Generated by extract.py.
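The id encoding can be illustrated with two small helpers. Both function names are hypothetical; only the `app_id*group` format comes from the description above.

```python
def encode_app_id(app_id, group):
    """Append the group number to an app id, e.g. 'pkg.name*506'."""
    return f"{app_id}*{group}"

def decode_app_id(encoded):
    """Split an encoded id back into (app_id, group).

    Splits on the last '*' so an app id containing '*' would still decode.
    """
    app_id, group = encoded.rsplit("*", 1)
    return app_id, int(group)

encoded = encode_app_id("com.ea.games.r3_row", 506)
decoded = decode_app_id(encoded)
```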

final_nearest_combined_v2.p: stores the found app ids and corresponding text embeddings as a Python dictionary. Because text embeddings are used as labels, apps found from the same base app are given the same label. Keys are app ids and values are the corresponding text embeddings. App ids are encoded with the group number. For example, for app com.ea.games.r3_row of group 506, the app id is encoded as com.ea.games.r3_row*506. Generated by extract.py.
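One possible reading of the label sharing is that every app found for a base app receives that base app's text embedding. The sketch below assumes this reading; the function name and the two input dictionaries are hypothetical and not taken from extract.py.

```python
import numpy as np

def share_base_labels(base_embeds, neighbours):
    """Give every found app the text embedding of its base app.

    base_embeds: dict mapping base app id -> text embedding (hypothetical)
    neighbours:  dict mapping base app id -> list of found app ids (hypothetical)
    """
    labels = {}
    for base_id, found_ids in neighbours.items():
        for found_id in found_ids:
            labels[found_id] = base_embeds[base_id]
    return labels

# Toy data; the second and third app ids are made up for illustration.
base_embeds = {"com.ea.games.r3_row*506": np.zeros(100, dtype=np.float32)}
neighbours = {"com.ea.games.r3_row*506": ["app.one*12", "app.two*98"]}
labels = share_base_labels(base_embeds, neighbours)
```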

out.o: debug information of extract.py will be printed in this file.

results.txt: debug information of cgan.py will be printed in this file.

All .pbs files are job files to be submitted to the HPC.

Sequence to build the model:

1. Run extract.py to obtain the training set.
2. Run cgan.py to train the model. Training samples are generated every 5 epochs. Sample pictures are stored in results/pics_100/ and checkpoints are saved in results/check_point/.

Files in ./www_embeddings/: there are 23 combined embedding files, from embeddings_0 to embeddings_22. However, embeddings_0.npy is empty, so only 22 files are needed. Each of these 22 files contains a 60000*12288 matrix of floats. The columns are: 1 app id + 4096 content embeddings + 4096 style embeddings + 100 text embeddings.
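A row with this layout could be sliced as sketched below. The column order is assumed to follow the order listed above, and `split_row` is a hypothetical helper. The listed groups cover the first 8293 columns, so the sketch makes no assumption about any remaining columns.

```python
import numpy as np

# Widths as listed in the README; the ordering is an assumption.
CONTENT_DIM, STYLE_DIM, TEXT_DIM = 4096, 4096, 100

def split_row(row):
    """Split one matrix row into (app_id, content, style, text)."""
    content_end = 1 + CONTENT_DIM
    style_end = content_end + STYLE_DIM
    text_end = style_end + TEXT_DIM
    return (row[0],
            row[1:content_end],
            row[content_end:style_end],
            row[style_end:text_end])

# Dummy row standing in for one row loaded with np.load(...).
row = np.arange(8293, dtype=np.float32)
app_id, content, style, text = split_row(row)
```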

Files in ./Text_Embedding/: there are 1001 text embedding files. Each is a Python dictionary with the app id as the key and the embedding as the value. Since all apps are divided into 1001 groups, the group number referred to in the code corresponds to the file number here.
