
Training on custom data #12

Closed
waleedrazakhan92 opened this issue Jan 20, 2023 · 24 comments

@waleedrazakhan92

Hello, can you share the details on how to train the model on custom data? Having gone through the paper, I believe we need:

  1. Input images (custom images)
  2. Masks (produced with face parsing, skin only)
  3. Depth masks (the same masks as above, but without the nose and mouth)
  4. Depth maps (how are these produced?)
  5. Albedo maps (using SfSNet), but you mention in your paper that you convert them to grayscale first. Did I get that correctly?
  6. Lighting directions. What are these and how do I obtain them?

Is there anything else needed to train the model?
If not, then can you please direct me towards how to find the missing data (4, 5, and 6) to train on my custom images?

@andrewhou1
Owner

To produce the depth maps, you can use https://github.com/zqbai-jeremy/DFNRMVS

The depth masks correspond to any pixel that has a valid face depth.

Yes, the albedo is converted first to grayscale.

The lighting directions are also produced by SfSNet: I use the first-order coefficients (2-4), normalize them, and treat that as the lighting direction.
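For concreteness, here is a minimal sketch of the depth-mask and grayscale-albedo steps; the file paths and the assumption that invalid pixels hold zero depth are illustrative only, not taken from the actual training code:

import cv2
import numpy as np

# Hypothetical inputs: a per-pixel face depth map (e.g. from DFNRMVS) and an
# RGB albedo estimate from SfSNet. The paths and the zero-means-invalid depth
# convention are assumptions for illustration.
depth = np.load('face_depth.npy')              # HxW float depth map
albedo_rgb = cv2.imread('sfsnet_albedo.png')   # HxWx3 albedo estimate

# Depth mask: any pixel that has a valid face depth.
depth_mask = (depth > 0).astype(np.uint8)

# The albedo is converted to grayscale before training.
albedo_gray = cv2.cvtColor(albedo_rgb, cv2.COLOR_BGR2GRAY)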

@waleedrazakhan92
Author

@andrewhou1 thank you for the quick response. Also, if I want to change the resolution of the model from 256 to 512 or 1024, what changes do I need to make in the model, besides self.img_height and self.img_width, to incorporate this change in size?

@andrewhou1
Owner

andrewhou1 commented Jan 21, 2023

Yes, those definitely need to be changed. Also, change all instances of 256 to your new resolution. There's also this line:

sample_increments = torch.reshape(torch.tensor(np.arange(0.025, 0.825, 0.005)), (self.num_sample_points, 1, 1, 1))

You may want to increase self.num_sample_points given the larger resolution and adjust np.arange accordingly to match.
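For example, a minimal sketch of how that line might look at 1024 resolution if you keep the same sampling density along each ray (a 4x increase over 256, as discussed further below); the exact counts are assumptions based on the np.arange call quoted above:

import numpy as np
import torch

# At 256 resolution, np.arange(0.025, 0.825, 0.005) yields 160 sample points.
# To keep the same per-ray sampling density at 1024 (4x the resolution),
# quadruple the count over the same [0.025, 0.825) range and set
# self.num_sample_points to match.
num_sample_points = 640  # assumed to be 160 in the 256-resolution model

# linspace with endpoint=False produces exactly num_sample_points values,
# so the reshape below always matches.
sample_increments = torch.reshape(
    torch.tensor(np.linspace(0.025, 0.825, num_sample_points, endpoint=False)),
    (num_sample_points, 1, 1, 1))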

@waleedrazakhan92
Author

@andrewhou1 I have made the suggested changes from 256 to 1024 wherever I can, and the model now takes 1024 input. I need advice and suggestions on a few more things to keep the performance high.

Can you explain what increasing self.num_sample_points does and, more importantly, how much should I increase it by?

Also, by changing the resolution, the h4_out shape changes from [1, 155, 16, 16] to [1, 155, 64, 64] (for 1024 resolution). Does the selection of indices for identity_features and lighting_features also need to change?

identity_features = h4_out[:, 0:128, :, :]
lighting_features = h4_out[:, 128:155, :, :]
LF_shape = list(lighting_features.size())

And also in:

LF_avg_pool = self.AvgPool_LF(lighting_features)
SL_lin1 = F.leaky_relu(self.linear_SL1(LF_avg_pool.permute(0, 2, 3, 1)), 0.2)
SL_lin2 = self.linear_SL2(SL_lin1)

The average pooling size also needs to change from (16, 16) to (64, 64), which now seems like quite a big window to average pool over. Do you suggest changing the average pooling size or the linear_SL input and output sizes for the model to keep its performance?

@andrewhou1
Owner

self.num_sample_points is the number of points that are sampled along each ray to determine if the original point on the face is under a cast shadow. If the points are sampled too sparsely, they may miss an occluding surface (such as the nose) and incorrectly determine a point to be well illuminated. This results in white stripes in the cast shadows, so self.num_sample_points should be set sufficiently high. If you want to maintain the same sampling frequency as I had at 256 resolution, increase the sampling rate by 4x for 1024, and change the np.arange portion to match. This is an experimental parameter: you can also lower the sampling rate and observe the effect on performance, but you should not need to set it any higher than 4x its current setting.

For the other two tensors, I believe 64x64 should be fine. If the performance seems noticeably worse and you want to change to 32x32 or 16x16, you would need to add one or two more downsampling and upsampling blocks respectively.
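For reference, a minimal sketch of keeping the lighting-feature pooling consistent at the larger feature map; the assumption that self.AvgPool_LF is an nn.AvgPool2d with a (16, 16) kernel in the 256-resolution model is based only on the code quoted above:

import torch.nn as nn

# Assuming the 256-resolution model pools the 16x16 lighting feature map down
# to a single value per channel with a (16, 16) average pool, the equivalent
# at 1024 resolution pools over the full 64x64 map.
AvgPool_LF = nn.AvgPool2d(kernel_size=(64, 64))  # assign to self.AvgPool_LF in the model

# An adaptive pool avoids hard-coding the spatial size altogether:
# AvgPool_LF = nn.AdaptiveAvgPool2d(output_size=(1, 1))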

@waleedrazakhan92
Author

waleedrazakhan92 commented Jan 21, 2023

> To produce the depth maps, you can use https://github.com/zqbai-jeremy/DFNRMVS
>
> The depth masks correspond to any pixel that has a valid face depth.
>
> Yes, the albedo is converted first to grayscale.
>
> The lighting directions are also produced by SfSNet: I use the first-order coefficients (2-4), normalize them, and treat that as the lighting direction.

Hi, so I've been trying to get lighting directions from the SfSNet model. I couldn't get the MATLAB version to work, but I found a working PyTorch version: https://github.com/Mannix1994/SfSNet-Pytorch. I'm getting the outputs as expected; however, I still wanted to be clear about the lighting directions. In the code there is an explanation of light_out (https://github.com/Mannix1994/SfSNet-Pytorch/blob/c2c1ed96b20dab66c5f84fe41ccb5d08aaa2291a/SfSNet_test.py#L66-L72), which I understand is the output determining the light direction. You mentioned that we need the first-order coefficients and normalize them to get the lighting directions. In the code they get 27 outputs (9 for each channel), which they reshape to form a 3-channel shading image.
So how do I normalize this output to form the training lighting inputs like the ones you have provided in the training dataset?

@andrewhou1
Owner

So among those 27 outputs, you can reshape them into a 9x3 matrix, where each column is the SH for one color channel. Then simply average across the three columns to get a single 9x1 vector. You can use this to determine your lighting directions.

@waleedrazakhan92
Author

waleedrazakhan92 commented Jan 24, 2023

@andrewhou1 but wouldn't that give me a 9x1 vector? In the training lightings .mat files there are just three values per image, so I'm still unsure about the exact process for getting the format and the values you've provided for training.
Can you please share the process, or a piece of code that you used, with which I can obtain the exact values in the exact format for the same image?

@andrewhou1
Owner

Right, so then you can use the 2nd, 3rd, and 4th values and normalize them as a vector.
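Putting the last few replies together, a minimal sketch of the full conversion from SfSNet's 27 lighting outputs to a unit lighting direction; the function name and the channel-major reshape are assumptions and should be checked against the output ordering in SfSNet-Pytorch:

import numpy as np

def lighting_direction_from_sh(light_out):
    """Convert SfSNet's 27-value SH lighting output into a unit 3-vector.

    Assumes light_out holds 9 spherical-harmonic coefficients per color
    channel in channel-major order (all of R, then G, then B); verify this
    against the reshape in SfSNet_test.py.
    """
    sh = np.asarray(light_out, dtype=np.float64).reshape(3, 9)  # one row per channel
    sh_avg = sh.mean(axis=0)       # average the channels -> 9 coefficients
    direction = sh_avg[1:4]        # 2nd, 3rd, and 4th values (first-order terms)
    return direction / np.linalg.norm(direction)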

@waleedrazakhan92
Author

@andrewhou1 can you tell me how long (in hours) it took to train the final model?

@andrewhou1
Owner

At 256 resolution it took about 1 day to train. At 1024 resolution, however, it would increase dramatically (maybe up to 4x) if you increase the sampling rate proportionally.

@waleedrazakhan92
Author

@andrewhou1 also, do the shapes of both of these tensors have anything to do with the batch size? When I try to change the batch size, there is a shape mismatch error involving torch.tensor([[[0.0]], [[0.0]], [[0.0]]]) and torch.reshape(tmp_incident_light_z, (3, 1, 1, 1)):

tmp_incident_light_z = torch.maximum(tmp_incident_light[:, 2], torch.tensor([[[0.0]], [[0.0]], [[0.0]]]).cuda())
incident_light = torch.cat((tmp_incident_light[:, 0:2], torch.reshape(tmp_incident_light_z, (3, 1, 1, 1))), 1)

@andrewhou1
Owner

Yes, it does. So if the batch size is n, then the torch.tensor call should have n of those 0.0s, and torch.reshape(tmp_incident_light_z, (n, 1, 1, 1)) should be used.

@waleedrazakhan92
Author

Thank you, so I replace these lines with:
tmp_incident_light_z = torch.maximum(tmp_incident_light[:, 2], torch.zeros(self.batch_size, 1, 1).float().cuda())

incident_light = torch.cat((tmp_incident_light[:, 0:2], torch.reshape(tmp_incident_light_z, (self.batch_size, 1, 1, 1))), 1)

This is the correct way, right?

@andrewhou1
Owner

Right, that should be correct.
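As an aside, a batch-size-agnostic variant is also possible, sketched below under the assumption that tmp_incident_light has shape (batch_size, 3, 1, 1); torch.clamp replaces the explicit zeros tensor and the reshape infers the batch dimension:

import torch

# Clamp the z component at zero instead of comparing against a hand-built
# zeros tensor, then let reshape infer the batch dimension with -1.
tmp_incident_light_z = torch.clamp(tmp_incident_light[:, 2], min=0.0)
incident_light = torch.cat(
    (tmp_incident_light[:, 0:2],
     torch.reshape(tmp_incident_light_z, (-1, 1, 1, 1))),
    dim=1)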

@waleedrazakhan92
Author

waleedrazakhan92 commented Jan 25, 2023

  1. One more question: you mention that you upscaled the results from SfSNet from 128 to 256 resolution. As for the lighting directions, did you use the same values that were computed at 128 resolution when training your model at 256 resolution?

If that's the case, then if I just upscale the images again to 512 and use the same lighting direction values you provided to train the model, would that be okay?

  2. Also, does self.batch_size need to be the same for both training and testing? For example, if I trained the model with batch size 2, do I also have to set self.batch_size to 2 during testing?

@andrewhou1
Owner

Correct, the lighting directions are independent of resolution.

@yafeim

yafeim commented Feb 3, 2023

Hello @andrewhou1, I notice that the SfSNet albedo images do not align well with the original images. How do you solve this problem? Thanks.

@andrewhou1
Owner

Thanks for your interest in our work!

Did you crop the original images first using our provided cropping code? They should align if the images are cropped.

@yafeim

yafeim commented Feb 4, 2023

[Two attached images: 0_orig (the cropped original) and 0 (the provided albedo)]

Thanks for your quick reply. The first attached image is what I got from the cropping code, and the second is the provided albedo in MP_data. They are still misaligned. Can you advise? Thanks.

@yafeim

yafeim commented Feb 4, 2023

Also, I got an AssertionError from "assert img.shape[0] == img.shape[1] == 256" when applying the cropping logic to image "10587.jpg". Was that image discarded?

@andrewhou1
Owner

Hmmm, that's interesting. I inspected that training image on my end, and the image matches the grayscale albedo (with the chin reflected). Did you install the separate dependencies for the cropping code? They're different from the dependencies for running the relighting model. If you did, you can try changing borderType=cv2.BORDER_DEFAULT to cv2.BORDER_REFLECT.
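For reference, a minimal sketch of that change, assuming the cropping code pads the image via cv2.copyMakeBorder (the call site, variable names, and padding widths here are hypothetical):

import cv2

# Hypothetical padding call from the cropping code, where pad is whatever
# border width the cropping logic uses.
# Before:
#   padded = cv2.copyMakeBorder(img, pad, pad, pad, pad, borderType=cv2.BORDER_DEFAULT)
# After:
padded = cv2.copyMakeBorder(img, pad, pad, pad, pad, borderType=cv2.BORDER_REFLECT)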

@andrewhou1
Owner

Also, yes, 10587.jpg was discarded.

@yafeim

yafeim commented Feb 4, 2023

Oh, I see. I think the problem was that I installed a different version of OpenCV using pip. Now I am getting aligned crops. Thanks a lot.
