Frequently Asked Questions about CUDA Programming


  • 7/23/2019 Frequently Asked Questions about CUDA Programming


    FREQUENTLY ASKED QUESTIONS (FAQ)

    Q: CUDA 5.0: linking with the samples' .h files in Visual Studio 2010

    compiler: exception.h: no such file or directory

    compiler: helper_string.h: no such file or directory

    A:

    You are using Visual Studio 2010, so you should add the include path to your project. Right-click the name of the project and select Properties. Under Configuration Properties, select VC++ Directories. Add an extra ; at the end of Include Directories and add C:\ProgramData\NVIDIA Corporation\CUDA Samples\v5.0\common\inc\. The common directory might also have a lib folder, which you should add under Library Directories.

    You should do this for each project that needs them. Alternatively, you can copy the headers to your VS directory under VC\include.

    Q: How to use printf in CUDA C?

    A:

    Add cudaDeviceReset(); or cudaDeviceSynchronize(); after calling the kernel. Device-side printf output is buffered and only reaches the host at points such as these.

    Example:

    #include <stdio.h>

    __global__ void helloCUDA(float f)
    {
        printf("Hello thread %d, f=%f\n", threadIdx.x, f);
    }

    int main()
    {
        helloCUDA<<<1, 5>>>(1.2345f);
        cudaDeviceReset();
        return 0;
    }

    http://stackoverflow.com/questions/14907713/cuda-5-0-linking-with-samples-h-in-visual-studio-2010

    Q: General way of solving Error: Stack around the variable 'x' was corrupted

    A:

    There is, however, a fairly small number of things that typically cause your problem:

    Improper handling of memory:

    Deleting something twice,

    Using the wrong type of deallocation (free for something allocated with new, etc.),

    Accessing something after its memory has been deleted.

    Returning a pointer or reference to a local.

    Reading or writing past the end of an array.

    Q: Why is the function checkCudaErrors, which belongs to the CUDA header, not detected even though I have written #include <headercuda.h>?

    A: Put #include <headercuda.h> at the very bottom of the list of includes.

    Q: What is an INLINE function in C code?

    A: >>> For more detail, look at the file "Inline" in the Ref folder.

    inline is generally regarded as a hint for the compiler to expand the function body at the call site if it can.

    So, to summarise the options:

    Declare everything static inline, and ensure that there are no undefined functions and no functions that call undefined functions.

    Declare everything inline for Studio and extern inline for gcc. Then provide a global version of the function in a separate file.

    The downside of inlining is that it can bloat your code size if the function is called from many places.

    In many places we create functions for small pieces of functionality that contain only a few simple executable instructions. Imagine their calling overhead each time they are called.

    When a normal function call instruction is encountered, the program stores the memory address of the instruction immediately following the function call statement, loads the function being called into memory, copies the argument values, jumps to the memory location of the called function, executes the function code, stores the return value of the function, and then jumps back to the address of the instruction that was saved just before executing the called function. For a very small function, that is too much run-time overhead.

    See also the file in the Ref folder.

    Declare the function static to fix the "already defined" function error.

    Example:

    If the error

    Error 1 error LNK2005: "void __cdecl MergeSort(int *,int *,int)" already defined in merge.cu.obj (reported for mergekernel.cu.obj in the MergeSortGPU project)

    appears, then add the static specifier to the MergeSort function. If that does not work, look at the file "Linker Tools Error LNK2005".

    Note, based on advice from the Stack Overflow question http://stackoverflow.com/questions/…/error-lnk2005-already-defined: extern only makes a difference on variable declarations. Functions already have external linkage by default, so adding extern to a function does not resolve the error.

    Q: How to convert ListBox items to an array of integers in C#

    A: >>> http://stackoverflow.com/questions/…/convert-listbox-items-to-array-integers-c-sharp

    Add.i? 3 Convert.i?can also work.

    It can also work like the following:

    int[] ratingArray = new int[numberRatingsInt];
    for (int i = 0; i < ratingListBox.Items.Count; i++) ratingArray[i] = Convert.


    Q: Member names cannot be the same as their enclosing type (C#)

    A: Methods that have the same name as their class are treated as constructors, and constructors don't have a return type.

    Change the class name or the method name.

    Q: PInvokeStackImbalance: C# call to unmanaged C++ function

    A: As mentioned in Dane Rose's comment, you can either use __stdcall on your C++ function or declare CallingConvention = CallingConvention.Cdecl on your DllImport.

    Q: Use of cudaMalloc(). Why the double pointer?

    A:

    All CUDA API functions return an error code (or cudaSuccess if no error occurred). All other results are passed back by reference. However, plain C has no references, which is why you have to pass the address of the variable in which you want the returned information to be stored. Since the value being returned is itself a pointer, you need to pass a double pointer.

    Another well-known function that operates on addresses for the same reason is the scanf function. How many times have you forgotten to write the & before the variable that you want to store the value in? ;)

    int i; scanf("%d", &i);

    It is needed because the function sets the pointer. As with every output parameter in C, you need a pointer to an actual variable that you set, rather than the value itself.

    http://stackoverflow.com/questions/10070701/member-names-cannot-be-the-same-as-their-enclosing-type-c-sharp
    http://stackoverflow.com/questions/2390407/pinvokestackimbalance-c-sharp-call-to-unmanaged-c-function
    http://stackoverflow.com/questions/2390407/pinvokestackimbalance-c-sharp-call-to-unmanaged-c-function/2738125#comment2825285_2738125
    http://stackoverflow.com/questions/7989039/use-of-cudamalloc-why-the-double-pointer

    Q: What is the complete syntax of a CUDA kernel?

    A: http://cuda-programming.blogspot.com.tr/2013/01/complete-syntax-of-cuda-kernels.html

    GPU Kernel with Streams: Run-time Kernel Launch

    In this article we'll let you know the complete syntax of CUDA kernels.

    We all love to learn and are always curious to know everything in detail. I was very disappointed when I was not able to find the complete syntax of CUDA kernels. So I thought I would give it a day and search everywhere; after the heavy search, I found the syntax of the CUDA kernel, and today I am presenting it to you, reader.

    A CUDA kernel launch takes four things inside the <<< >>> brackets.

    The first argument is known as "Grid Size", followed by "Block Size", followed by the size of "Shared Memory", and it ends with the "Stream" argument.

    Here is the complete syntax:

    Kernel_Name<<<GridSize, BlockSize, SMEMSize, Stream>>>(arguments, ....)

    Grid Size

    We all know what grid size is; in case you don't, read further.

    Grid size is defined by the number of blocks in a grid. In the earliest version of the CUDA architecture (compute capability 1.x) the grid can only be organized in two dimensions (X and Y direction). From compute capability 2.x onwards the grid can be organized in three dimensions (X, Y and Z).

    Block Size

    The blocks are organized in terms of threads. The thread is the smallest unit in parallel programming, and so it is in CUDA.

    Shared Memory (SMEMSize)

    This is the size of the shared memory to be used in the CUDA kernel for shared variable space. It is needed because the shared memory size of a CUDA kernel can be dynamic, that is, chosen at launch time.

    http://cuda-programming.blogspot.in/2013/01/what-is-compute-capability-in-cuda.html

    Streams

    A stream is a sequence of operations that are performed in order on the device.

    Streams allow independent, concurrent, in-order queues of execution. The stream argument tells in which of these queues on the device the kernel will execute.

    Operations in different streams can be interleaved and overlapped, which can be used to hide data transfers between host and device.

    Q: What is a stream in the CUDA API?

    A: http://cuda-programming.blogspot.in/2013/01/cuda-streams-what-is-cuda-streams.html

    Stream

    A stream is a sequence of operations that are performed in order on the device. Streams allow independent, concurrent, in-order queues of execution.

    - Operations in different streams can be interleaved and overlapped, which can be used to hide data transfers between host and device.

    - Use cudaStreamCreate() (run-time API) or cuStreamCreate() (Driver API) to create a stream of type cudaStream_t.

    - The default stream (ID 0) need not be created.

    - Multiple streams exist within a single context; they share memory and other resources.

    - Copies and kernel launches with the same stream parameter execute in order.

    Function prototype

    Creates a new asynchronous stream.

    cudaError_t cudaStreamCreate(cudaStream_t *pStream)

    Parameters:
    pStream - Pointer to new stream identifier

    Returns:
    cudaSuccess, cudaErrorInvalidValue

    Note that this function may also return error codes from previous, asynchronous launches.

    Q: What are STATIC variables and functions?

    A:

    Short answer ...it depends.

    1. Static defined local variables do not lose their value between function calls. In other

    words they are global variables, but scoped to the local function they are defined in.

    2. Static global variables are not visible outside of the C file they are defined in.

    3. Static functions are not visible outside of the C file they are defined in.

    Static member functions are functions that do not require an instance of the class, and are called the same way you access static member variables: with the class name rather than a variable name (e.g. a_class::static_function(); rather than an_instance.function();). Static member functions can only operate on static members, as they do not belong to specific instances of a class. Static member functions can be used to modify static member variables to keep track of their values: for instance, you might use a static member function if you chose to use a counter to give each instance of a class a unique ID.

    Q: Why do strange values show in my C / C++ code?

    A:

    In rray 9 0 $ I;II


    value

    Value to be set.


    Q: What do ">>" and "<<" mean?

    A: http://www.c4learn.com/c-programming/c-bitwise-right-shift

    C Bitwise Right Shift (>>) Operator

    Bitwise Right Shift Operator in C

    1. It is denoted by >>.

    2. The bit pattern of the data can be shifted by a specified number of positions to the right.

    3. When data is shifted right, the vacated leading bit positions are filled with zeros (for unsigned values).

    4. The right shift operator is a binary operator (bi = two).

    5. Binary means an operator that requires two arguments.

    Quick Overview of Right Shift Operator

    Original Number A: 0000 0000 0011 1100
    Right Shift 2: 0000 0000 0000 1111
    Leading 2 blanks: replaced by 0
    Direction of movement of data: right ========>>>>>>

    Syntax: variable >> number_of_places

    Q: How to use nvprof (NVIDIA profiling) at the Windows Command Prompt?

    A: >>> http://stackoverflow.com/questions/…/gpu-power-profiling-with-nvprof-and-visual-profiler

    "C:\Program Files\NVIDIA GPU Computing