Question regarding age training #211
Comments
@zevele I may be able to reproduce your issue in a Docker container. |
Thanks @takuya-takeuchi! OS: Windows 10. Is there anything I can do to help debug this on my system? |
You can check whether the GPU is running or not by
Just in case, are you using FaceRecognitionDotNet.CUDAXXX? |
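The command suggested in the comment above is not preserved in this thread. As one possible way to check, here is a minimal sketch that samples GPU utilization through `nvidia-smi`. It assumes `nvidia-smi` is installed and on `PATH`; the class and method names (`GpuCheck`, `ParseUtilization`, `QueryUtilization`) are hypothetical and not part of FaceRecognitionDotNet or DlibDotNet.

```csharp
using System;
using System.Diagnostics;
using System.Globalization;
using System.Linq;

public static class GpuCheck
{
    // Parses the output of
    //   nvidia-smi --query-gpu=utilization.gpu --format=csv,noheader,nounits
    // which prints one percentage per GPU, one per line.
    // Values that are not integers (for example "[N/A]") are returned as -1.
    public static int[] ParseUtilization(string output)
    {
        return output
            .Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(line => line.Trim())
            .Where(line => line.Length > 0)
            .Select(line => int.TryParse(line, NumberStyles.Integer,
                                         CultureInfo.InvariantCulture, out var value)
                ? value
                : -1)
            .ToArray();
    }

    // Runs nvidia-smi once and returns the utilization of each GPU in percent.
    // Assumption: nvidia-smi is on PATH; Process.Start throws if it is not.
    public static int[] QueryUtilization()
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = "nvidia-smi",
            Arguments = "--query-gpu=utilization.gpu --format=csv,noheader,nounits",
            RedirectStandardOutput = true,
            UseShellExecute = false,
            CreateNoWindow = true
        };

        using (var process = Process.Start(startInfo))
        {
            var output = process.StandardOutput.ReadToEnd();
            process.WaitForExit();
            return ParseUtilization(output);
        }
    }

    public static void Main()
    {
        var utilization = QueryUtilization();
        for (var i = 0; i < utilization.Length; i++)
        {
            Console.WriteLine("GPU " + i + ": " + utilization[i] + "%");
        }
    }
}
```

A single sample can read 0% between batches, so it is probably worth running this several times while the training is in progress before concluding that the GPU is idle.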
I ran the training again and checked the GPU utilization: it is not using the GPU. I'm following the wiki on training the model. I'm running the following command (as per the instructions). I also see you wrote there: |
Which CUDA version is installed on your machine? |
Thanks! I installed CUDA 11.2 and also downloaded cuDNN for CUDA 11.2. Then I added the cuDNN path to the environment variables. Rebuilt the project using: But still zero GPU utilization. What am I missing? |
You do not need to use FaceRecognitionDotNet. |
Actually, I tried that and got an error, so I thought I needed both of them. With both of them the training runs (just without CUDA). If I do:
|
I have reproduced your issue, but I'm not sure why it occurs. AgeTraining does not work even though it links DlibDotNet.CUDA112. |
These libraries are deployed to the app directory.
But the error still occurs. |
Thanks @takuya-takeuchi. I tried copying the files manually, but I get the same error. Is there anything else that can be done? |
Note: A simple dotnet test program that links FRDN.CUDA112 works fine with the CUDA libs.

```csharp
using System;
using System.Collections.Generic;
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;
using System.Linq;
using System.Net.Http;
using System.Reflection;
using System.Runtime.Serialization.Formatters.Binary;
using DlibDotNet;
using FaceRecognitionDotNet.Extensions;
using Xunit;
using Xunit.Abstractions;

namespace FaceRecognitionDotNet.Tests
{
    public class FaceRecognitionTest
    {
        private readonly string ModelDirectory = "Models";

        public FaceRecognitionTest(ITestOutputHelper testOutputHelper)
        {
            // Use the model directory from the environment variable if it exists.
            var dir = Environment.GetEnvironmentVariable("FaceRecognitionDotNetModelDir");
            if (Directory.Exists(dir))
            {
                ModelDirectory = dir;
            }
        }

        [Fact]
        public void Test()
        {
            // Creating the instance is enough to load the native libraries.
            using (var fr = FaceRecognitionDotNet.FaceRecognition.Create(ModelDirectory))
            {
            }
        }
    }
}
```

But a simple console program that links FRDN.CUDA112 does not work, even when the CUDA libs are deployed.

```csharp
using System;
using System.Collections.Generic;
using System.Drawing;
using System.IO;
using System.Linq;
using System.Threading.Tasks;
using FaceRecognitionDotNet;

namespace Issue
{
    class Program
    {
        private static void Main(string[] args)
        {
            // Creating the instance is enough to load the native libraries.
            using (var fr = FaceRecognitionDotNet.FaceRecognition.Create("models"))
            {
                Console.WriteLine("test");
            }
        }
    }
}
```
|
And the link issue occurs only with CUDA 11.X. |
Note: Running from the project directory fails.

```
D:\Works\OpenSource\Temp\FaceRecognitionDotNet\17.#211>dotnet run -c Release
Unhandled Exception: System.TypeInitializationException: The type initializer for 'DlibDotNet.NativeMethods' threw an exception. ---> System.DllNotFoundException: Unable to load DLL 'DlibDotNetNativeDnn': The specified module could not be found. (Exception from HRESULT: 0x8007007E)
   at DlibDotNet.NativeMethods.LossMetric_anet_type_create()
   at DlibDotNet.NativeMethods..cctor()
   --- End of inner exception stack trace ---
   at DlibDotNet.NativeMethods.get_frontal_face_detector()
   at FaceRecognitionDotNet.FaceRecognition..ctor(String directory)
   at Issue.Program.Main(String[] args) in D:\Works\OpenSource\Temp\FaceRecognitionDotNet\17.#211\Program.cs:line 18
```

But it works when launched from the output directory.

```
D:\Works\OpenSource\Temp\FaceRecognitionDotNet\17.#211\bin\Release\netcoreapp2.0>dotnet Issue.dll
test
```

It looks like the program runs in the wrong current directory. With the CUDA libraries symlinked into the project directory, `dotnet run` works.

```
D:\Works\OpenSource\Temp\FaceRecognitionDotNet\17.#211>dir
 Volume in drive D is Data
 Volume Serial Number is ACE6-77C8

 Directory of D:\Works\OpenSource\Temp\FaceRecognitionDotNet\17.#211

2022/08/14  16:08    <DIR>          .
2022/08/14  16:08    <DIR>          ..
2022/08/14  15:21    <DIR>          bin
2022/08/14  16:08    <SYMLINK>      cublas64_11.dll [C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2\bin\cublas64_11.dll]
2022/08/14  16:08    <SYMLINK>      cublasLt64_11.dll [C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2\bin\cublasLt64_11.dll]
2022/08/14  16:08    <SYMLINK>      cudnn64_8.dll [C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2\bin\cudnn64_8.dll]
2022/08/14  16:08    <SYMLINK>      cudnn_adv_infer64_8.dll [C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2\bin\cudnn_adv_infer64_8.dll]
2022/08/14  16:08    <SYMLINK>      cudnn_adv_train64_8.dll [C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2\bin\cudnn_adv_train64_8.dll]
2022/08/14  16:08    <SYMLINK>      cudnn_cnn_infer64_8.dll [C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2\bin\cudnn_cnn_infer64_8.dll]
2022/08/14  16:08    <SYMLINK>      cudnn_cnn_train64_8.dll [C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2\bin\cudnn_cnn_train64_8.dll]
2022/08/14  16:08    <SYMLINK>      cudnn_ops_infer64_8.dll [C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2\bin\cudnn_ops_infer64_8.dll]
2022/08/14  16:08    <SYMLINK>      cudnn_ops_train64_8.dll [C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2\bin\cudnn_ops_train64_8.dll]
2022/08/14  16:08    <SYMLINK>      curand64_10.dll [C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2\bin\curand64_10.dll]
2022/08/14  16:08    <SYMLINK>      cusolver64_11.dll [C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2\bin\cusolver64_11.dll]
2022/03/18  21:34            29,546 face1.jpg
2022/03/18  21:34            12,429 face2.jpg
2022/08/14  15:21               376 Issue.csproj
2021/03/27  20:28             1,114 Issue.sln
2021/02/16  01:33           729,940 mmod_human_face_detector.dat
2022/08/14  14:19    <SYMLINKD>     models [D:\Works\OpenSource\FaceRecognitionDotNet.Models]
2022/08/14  15:21    <DIR>          obj
2022/08/14  14:20               428 Program.BAK
2022/08/14  15:12               460 Program.cs
              18 File(s)        774,293 bytes
               5 Dir(s)  836,606,410,752 bytes free

D:\Works\OpenSource\Temp\FaceRecognitionDotNet\17.#211>dotnet run -c Release
test
```
|
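Since the error seems to depend on which directory the process starts in, a small diagnostic can show where the CUDA DLLs are actually visible. The sketch below is only an illustration: the `NativeLibraryCheck` class and `FindMissing` helper are hypothetical names, the DLL list is a subset copied from the directory listing in the previous comment, and the check looks at just two locations rather than the full Windows DLL search path.

```csharp
using System;
using System.IO;
using System.Linq;

public static class NativeLibraryCheck
{
    // Returns the names from fileNames that do not exist in directory.
    public static string[] FindMissing(string directory, string[] fileNames)
    {
        return fileNames
            .Where(name => !File.Exists(Path.Combine(directory, name)))
            .ToArray();
    }

    public static void Main()
    {
        // Assumption: a subset of the DLL names from the directory listing;
        // the exact set a given build needs may differ.
        var dlls = new[]
        {
            "cublas64_11.dll",
            "cublasLt64_11.dll",
            "cudnn64_8.dll",
            "curand64_10.dll",
            "cusolver64_11.dll"
        };

        // Compare the working directory with the directory of the built app.
        var directories = new[] { Environment.CurrentDirectory, AppContext.BaseDirectory };
        foreach (var directory in directories)
        {
            var missing = FindMissing(directory, dlls);
            var status = missing.Length == 0
                ? "all found"
                : "missing " + string.Join(", ", missing);
            Console.WriteLine(directory + ": " + status);
        }
    }
}
```

Running it once with `dotnet run` from the project directory and once with `dotnet Issue.dll` from the output directory should show whether the two launches see different sets of files.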
I've tried copying these DLLs from the release folder to the solution folder (keeping them in both folders). Now the trainer runs again, but still with no GPU utilization: Should I try an older version of CUDA? I didn't create the symlinks and do not have the models link (is that a problem?)
|
I have the same problem. Did you ever solve it? Training the model on the CPU takes a lifetime. |
|
I just can't get the GPU to work. I have tried almost every CUDA version. I have now been training the model on the CPU for over 1.5 weeks and it is about 7% done :| This is just a waste of resources. Could someone share the trained models for age and gender? |
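For a sense of scale, the figures in the comment above (about 1.5 weeks elapsed, about 7% done) can be extrapolated linearly. This is a rough sketch that assumes progress stays constant, which the thread does not confirm; `TrainingEta` and `EstimateTotal` are hypothetical names.

```csharp
using System;

public static class TrainingEta
{
    // Linear extrapolation: total time = elapsed time / fraction completed.
    public static TimeSpan EstimateTotal(TimeSpan elapsed, double fractionDone)
    {
        if (fractionDone <= 0.0 || fractionDone > 1.0)
        {
            throw new ArgumentOutOfRangeException(nameof(fractionDone));
        }
        return TimeSpan.FromDays(elapsed.TotalDays / fractionDone);
    }

    public static void Main()
    {
        var elapsed = TimeSpan.FromDays(1.5 * 7);  // 1.5 weeks
        var total = EstimateTotal(elapsed, 0.07);  // about 7% done
        var remaining = total - elapsed;

        Console.WriteLine("Estimated total days: " + total.TotalDays.ToString("F1"));
        Console.WriteLine("Estimated remaining days: " + remaining.TotalDays.ToString("F1"));
    }
}
```

With these inputs the estimate is 10.5 / 0.07 = 150 days in total, so roughly five months of CPU training at the reported rate.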
I'm trying to do the age training now that bug #206 has been resolved, but it's very slow. After 24 hours I'm on step 83 and epoch 1; at this rate the training will take about two years to complete (to epoch 600). Is it going to get faster? How long is the training supposed to take?