Astropulse 7.00 released for Linux 32&64, Win 32&64, Win32+AMD/NVIDIA/Intel GPU
Author | Message |
---|---|
Joined: 15 Jul 05 Posts: 176 Credit: 1,674,830 RAC: 0 |
I got some errors on one Windows 7 x64 machine with AstroPulse v7 v7.00 (sse2) and AstroPulse v7 v7.00 (sse). Now I got my next AstroPulse v7 v7.00 (sse) result, with the same behavior: Exit status -226 (0xffffffffffffff1e) ERR_TOO_MANY_EXITS http://setiweb.ssl.berkeley.edu/beta/result.php?resultid=17308394 The next test is to reinstall BOINC without the service on that machine. @SusieQ: do you also have a service installation of the BOINC client? Matthias |
Joined: 14 Oct 05 Posts: 1137 Credit: 1,848,733 RAC: 0 |
I've unintentionally done a Beta test of a feature of Astropulse applications which is unlikely to be seen often, so here's some detail. While processing Task 17248384 with astropulse_7.00_windows_intelx86.exe, I also wanted to do some offline testing. I suspended other tasks so BOINC was running only that one CPU task, freeing the other CPU of that Core 2 Duo and also the GT 630 GPU. But the combination with offline testing caused a black screen crash of the system after ten minutes or so. After powering down and rebooting, when BOINC was started it indicated the boinc_task_state.xml in the slot for the active task had no project URL line. I suspended the task to check things out. In fact the file was the right size but all bytes were zeroed. In addition, the ap_state.dat file was zeroed in the same way and that's where the feature comes in. The application recognizes that the file doesn't have the needed checkpoint data so tries rereading it. If 100 tries don't get the needed data, the application restarts processing the task from the beginning. All of that worked as designed so eventually the task finished successfully, a positive result for the Beta test. Run time is long by 48 hours or so because it was around 72% complete before having to go back to the beginning. Of course Murphy's Law had to step in and decree that the wingmate's attempt with the OpenCL Intel GPU app version would do a probably false 30/30 overflow, but I'm confident the final judgement will validate my result. The SSE, SSE2, and OpenCL app versions use a different method of dealing with clobbered checkpoint files. Raistmer implemented a double checkpoint system with the files updated alternately, so if the most recent files are bad the previous set is available. But if the file system has clobbered both sets those app versions give up on the attempted restart and abort the task. Joe |
Joined: 12 Nov 10 Posts: 1149 Credit: 32,460,657 RAC: 1 |
Matthias Yes, I'm running BOINC as a service. My non SSE/SSE2 work units are still running happily but have a way to go yet - they are between 60% and 70% complete. SusieQ |
Joined: 14 Oct 05 Posts: 1137 Credit: 1,848,733 RAC: 0 |
Eric, on the Applications page, yesterday's update of Windows OpenCL app versions took the "Average processing" for intel_gpu and nvidia down to 0, but left the ati at 1395 GFLOPS. It isn't obvious whether the related app version stats pages had a similar effect. Would it be practical to have those stats pages show some additional details such as the average pfc and count of tasks included, or maybe date/times of inception and last update? Joe |
Joined: 15 Mar 05 Posts: 1547 Credit: 27,183,456 RAC: 0 |
Thanks. At some point David said he was going to add code to transfer app_version stats on version updates. It looks like it may only work some of the time. |
Joined: 25 Apr 07 Posts: 44 Credit: 30,057,505 RAC: 0 |
umm, is AP 7.01 for Intel GPU the same as 7.0 for Nvidia and ATI? I just got my first 7.01, but have to work through a 4 day queue... (and don't have access to my Nvidia computers, being in Thailand, on vaca…) |
Joined: 14 Oct 05 Posts: 1137 Credit: 1,848,733 RAC: 0 |
umm, is AP 7.01 for Intel GPU the same as 7.0 for Nvidia and ATI? I just got my first 7.01, but have to work through a 4 day queue... (and don't have access to my Nvidia computers, being in Thailand, on vaca…) The Windows OpenCL 7.01 builds for all 3 GPU types are from the same code base. The primary change from the previous 7.00 builds is refinement of the default tuning parameters, with the dual goal of better stability on weak GPUs and slightly better performance on most GPUs. Joe |
Joined: 2 Jul 13 Posts: 505 Credit: 5,019,318 RAC: 0 |
Seems there's a Too Much RFI Bug in the Linux SSE2 App? All my B3_P1s Erred out the same; <result> <name>ap_31au13ac_B3_P1_00416_20140713_20339.wu_2</name> <final_cpu_time>0.000000</final_cpu_time> <final_elapsed_time>1.074165</final_elapsed_time> <exit_status>136</exit_status> <state>3</state> <platform>x86_64-pc-linux-gnu</platform> <version_num>700</version_num> <plan_class>sse2</plan_class> <stderr_out> <![CDATA[ <message> process got signal 8 </message> <stderr_txt> Not using ap_cmdline.txt-file, using commandline options. AstroPulse v7.00 Linux 64 bit, Rev 2438 V7 match, by Raistmer with support of Lunatics.kwsn.net team. ffa threshold, twindechirp, lrint mods by Joe Segur Build features: Non-graphics SMALL_CHIRP_TABLE FFTW BLANKIT USE_INCREASED_PRECISION SSE2 64bit System: Linux x86_64 Kernel: 3.2.0-65-generic CPU : Intel(R) Core(TM)2 Quad CPU Q9400 @ 2.66GHz 4 core(s), Speed : 2667.000 MHz L1 : 64 KB, Cache : 3072 KB state.fold_buf_size_short=65536; state.fold_buf_size_long=262144 In ap_remove_radar.cpp: get_indices_to_randomize: num_ffts_forecast < 100. Blanking too much RFI? percent blanked: 100.00 </stderr_txt> ]]> </stderr_out> <ready_to_report/> :-( Someone else had the same problem, Error tasks for computer 51991 |
Joined: 18 Aug 05 Posts: 2423 Credit: 15,878,738 RAC: 0 |
SIGFPE 8 Core Floating point exception News about SETI opt app releases: https://twitter.com/Raistmer |
Joined: 29 May 06 Posts: 1037 Credit: 8,440,339 RAC: 0 |
Seems there's a Too Much RFI Bug in the Linux SSE2 App? All my B3_P1s Erred out the same; One WU on my Ubuntu C2D T8100 laptop ended the same way with the SSE2 app: http://setiweb.ssl.berkeley.edu/beta/workunit.php?wuid=6493143 Claggy |
Joined: 18 Jan 06 Posts: 1038 Credit: 18,734,730 RAC: 0 |
SIGFPE 8 Core Floating point exception Found that in one function printing the counters a check for zero was missing. _\|/_ U r s |
Joined: 2 Jul 13 Posts: 505 Credit: 5,019,318 RAC: 0 |
Is it fixed yet? I still have 3 suspended B3_P1s waiting on a successful completion. Does the regular CPU App have the same problem? I could remove the plan class and let the other App run them. Or just wait. I'm waiting... |
Joined: 22 Sep 09 Posts: 19 Credit: 906,325 RAC: 0 |
http://setiweb.ssl.berkeley.edu/beta/workunit.php?wuid=6487859 AMD AMD Radeon HD 7870/7950/7970/R9 280X series (Tahiti) (3072MB) driver: 1.4.1848 OpenCL: 1.2 (Radeon 7950) Drivers: Cat 14.4 Stderr output <core_client_version>7.2.39</core_client_version> <![CDATA[ <stderr_txt> Running on device number: 0 Priority of worker thread raised successfully Priority of process adjusted successfully, below normal priority class used OpenCL platform detected: Advanced Micro Devices, Inc. BOINC assigns device 0 Info: BOINC provided OpenCL device ID used Used GPU device parameters are: Number of compute units: 28 Single buffer allocation size: 256MB max WG size: 256 Build features: Non-graphics BLANKIT OpenCL TWIN_FFA OCL_ZERO_COPY COMBINED_DECHIRP_KERNEL FFTW USE_INCREASED_PRECISION USE_SSE2 x86 CPUID: Intel(R) Core(TM) i5-2500K CPU @ 3.30GHz Cache: L1=64K L2=256K CPU features: FPU TSC PAE CMPXCHG8B APIC SYSENTER MTRR CMOV/CCMP MMX FXSAVE/FXRSTOR SSE SSE2 HT SSE3 SSSE3 SSE4.1 SSE4.2 AVX AstroPulse v7 Windows x86 rev 2488, V7 match, by Raistmer with support of Lunatics.kwsn.net team. SSE2 OpenCL version by Raistmer oclFFT fix for ATI GPUs by Urs Echternacht ffa threshold mods by Joe Segur SSE3 dechirping by JDWhale Combined dechirp kernel by Frizz Number of OpenCL platforms: 1 OpenCL Platform Name: AMD Accelerated Parallel Processing Number of devices: 1 Max compute units: 28 Max work group size: 256 Max clock frequency: 800Mhz Max memory allocation: 1073741824 Cache type: Read/Write Cache line size: 64 Cache size: 16384 Global memory size: 3221225472 Constant buffer size: 65536 Max number of constant args: 8 Local memory type: Scratchpad Local memory size: 32768 Queue properties: Out-of-Order: No Name: Tahiti Vendor: Advanced Micro Devices, Inc. 
Driver version: 1445.5 (VM) Version: OpenCL 1.2 AMD-APP (1445.5) Extensions: cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_d3d10_sharing cl_khr_d3d11_sharing cl_khr_dx9_media_sharing cl_khr_image2d_from_buffer cl_khr_spir cl_khr_gl_event state.fold_buf_size_short=65536; state.fold_buf_size_long=262144 INFO: can't open binary kernel file: C:\Games/projects/setiweb.ssl.berkeley.edu_beta\AstroPulse_Kernels_r2488.cl_Tahiti.bin_V6_TWIN_FFA_14455VM, continue with recompile... INFO: binary kernel file created WARNING: can't open binary kernel file for oclFFT plan: C:\Games/projects/setiweb.ssl.berkeley.edu_beta\AP_clFFTplan_Tahiti_32768_r2488.bin_14455VM, continue with recompile... 
single pulses: 3 repetitive pulses: 0 percent blanked: 0.00 Single pulse: peak_power=38.03 dm=-5404 fft_num=13713408 peak_bin=13719860 scale=2 Single pulse: peak_power=87.11 dm=-5958 fft_num=4849664 peak_bin=4856896 scale=5 Single pulse: peak_power=366.9 dm=-12499 fft_num=32882688 peak_bin=32892928 scale=8 class T_remove_radar: total=1.52e+009, N=1, <>=1.52e+009, min=1.52e+009, max=1.52e+009 class T_main_loop_L1: total=4.68e+012, N=111, <>=4.21e+010, min=3.97e+010, max=4.70e+010 class T_FFT_forward: total=9.90e+009, N=909312, <>=1.09e+004, min=6.47e+003, max=4.19e+007 class T_remove_radar_randomize: total=0.00e+000, N=0, <>=0.00e+000, min=1.84e+019, max=0.00e+000 class T_build_chirp_table: total=0.00e+000, N=0, <>=0.00e+000, min=1.84e+019, max=0.00e+000 class T_DataWrite: total=0.00e+000, N=0, <>=0.00e+000, min=1.84e+019, max=0.00e+000 class T_DataWrite_ns: total=0, N=0, <>=0, min=0 max=0 class T_oclReadBuf: total=0.00e+000, N=0, <>=0.00e+000, min=1.84e+019, max=0.00e+000 class T_ChirpWrite: total=0.00e+000, N=0, <>=0.00e+000, min=1.84e+019, max=0.00e+000 class T_ChirpWrite_ns: total=0, N=0, <>=0, min=0 max=0 class T_dechirp: total=1.16e+010, N=909312, <>=1.28e+004, min=9.12e+003, max=3.90e+007 class Dechirp_ns: total=0, N=0, <>=0, min=0 max=0 class Half_ns: total=0, N=0, <>=0, min=0 max=0 class T_PC_single_pulse_kernel_FFA_update: total=3.76e+012, N=909312, <>=4.13e+006, min=3.55e+006, max=4.47e+008 class PC_ns: total=0, N=0, <>=0, min=0 max=0 class T_oclReadBuf: total=0.00e+000, N=0, <>=0.00e+000, min=1.84e+019, max=0.00e+000 class T_oclWriteBuf: total=0.00e+000, N=0, <>=0.00e+000, min=1.84e+019, max=0.00e+000 class T_FFT_inverse: total=0.00e+000, N=0, <>=0.00e+000, min=1.84e+019, max=0.00e+000 class T_ffa: total=8.41e+011, N=999, <>=8.41e+008, min=3.62e+008, max=4.82e+009 class T_GPU_buffer_read_backs: total=8, N=8, <>=1, min=1 max=1 TWIN_FFA OCL_ZERO_COPY USE_OPENCL OPENCL_WRITE USE_INCREASED_PRECISION SMALL_CHIRP_TABLE COMBINED_DECHIRP_KERNEL BLANKIT rev 2488 
GPU device synched 20:50:00 (1960): called boinc_finish </stderr_txt> ]]> seems to be working or how? |
Joined: 2 Jul 13 Posts: 505 Credit: 5,019,318 RAC: 0 |
SIGFPE 8 Core Floating point exception I became impatient; I couldn't download any new tasks with the tasks suspended. I removed the plan class and had the base CPU App run them. No problems with the base CPU App. Note the Computer that had the tasks before they were sent to me; http://setiweb.ssl.berkeley.edu/beta/workunit.php?wuid=6493066 http://setiweb.ssl.berkeley.edu/beta/workunit.php?wuid=6494446 http://setiweb.ssl.berkeley.edu/beta/workunit.php?wuid=6495149 Any idea when the Linux ATI AP App will be released? |
Joined: 15 Jul 05 Posts: 176 Credit: 1,674,830 RAC: 0 |
I changed the BOINC installation on one machine from service to standard. Now I can also run AstroPulse v7 v7.01 (opencl_intel_gpu_100) results. Currently I don't get any sse and sse2 results on my machine (I got too many AstroPulse v7 v7.00), but yoyo ows is now running successfully; there I had the same restart problem. Regarding AstroPulse v7 v7.01 (opencl_intel_gpu_100), it looks like there is also a problem with false detection of pulses: "single pulses: 30 repetitive pulses: 1". Currently I would say 50% of the opencl_intel_gpu_100 results show too many pulses. http://setiweb.ssl.berkeley.edu/beta/results.php?hostid=72058&offset=0&show_names=0&state=0&appid=29 Edit: corrected the quote. Matthias |
Joined: 18 Jan 06 Posts: 1038 Credit: 18,734,730 RAC: 0 |
Any idea when the Linux ATI & NV AP App will be released? Assumption: when Eric has time. _\|/_ U r s |
Joined: 16 May 06 Posts: 150 Credit: 136,942 RAC: 0 |
I just realized I posted this message in the wrong place - should have been here. |
Joined: 18 Aug 05 Posts: 2423 Credit: 15,878,738 RAC: 0 |
I reproduced this issue on my own host. It exists in BOINC 7.2.42 and does not exist in BOINC 6.10.60. So, if only CPU apps are involved, I recommend downgrading for now. Also, the current BOINC service install is unable to handle GPU work on modern OSes, so a user-mode install is required for GPU apps anyway. I got some errors on one Windows 7 x64 machine with AstroPulse v7 v7.00 (sse2) and AstroPulse v7 v7.00 (sse) News about SETI opt app releases: https://twitter.com/Raistmer |
Joined: 3 Jan 07 Posts: 1451 Credit: 3,272,268 RAC: 0 |
I reproduced this issue on own host. See my reply to your boinc_alpha report. It appears that your BOINC API tree may be missing commit 3aaeadaf99669c6460a183e7e9f2063f39152031 (and possibly others too). Edit - And see Jacob Klein's confirmation. |
Joined: 14 Oct 05 Posts: 1137 Credit: 1,848,733 RAC: 0 |
Eric, I confess I have muddied your completion stats for AstroPulse v7 v7.01 (opencl_nvidia_100) slightly... There has long [1] been an occasional problem with the BOINC API code crashing when boinc_finish() is called from the OpenCL apps, my task 17351181 did that overnight. Because I'm on dial-up BOINC's network activity was off, and checking the outfile for that task showed it was fine as expected. I decided to change the result status in BOINC's client_state.xml so it would be uploaded and validated. That worked as I planned, demonstrating that the application code had done the work correctly and it's BOINC API code which faults. If it happens again I promise to let BOINC trash the work. Joe [1] SETI@home main project thread OpenCL AstroPulse crash after processing completion - write here. {edit} So when I checked status just after posting I found the problem had happened again on task 17351302. As promised, I let BOINC trash it. Joe |
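
Joe's description earlier in the thread of the double checkpoint system (two sets of files updated alternately, falling back to the previous set if the most recent one is clobbered, and giving up if both are bad) can be illustrated with a short sketch. This is not the AstroPulse code: the file names `checkpoint_a.json` and `checkpoint_b.json`, the JSON layout, and the `generation` counter are all hypothetical, chosen only to show the idea.

```python
import json
import os


def save_checkpoint(directory, state, generation):
    """Write state to slot A or B alternately, so the older slot stays intact."""
    slot = "a" if generation % 2 == 0 else "b"  # hypothetical slot naming
    path = os.path.join(directory, f"checkpoint_{slot}.json")
    payload = {"generation": generation, "state": state}
    with open(path, "w") as f:
        json.dump(payload, f)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data to disk


def load_checkpoint(directory):
    """Return (generation, state) from the newest readable slot, or None."""
    best = None
    for slot in ("a", "b"):
        path = os.path.join(directory, f"checkpoint_{slot}.json")
        try:
            with open(path) as f:
                payload = json.load(f)
            generation = payload["generation"]
            state = payload["state"]
        except (OSError, ValueError, KeyError, TypeError):
            # Missing, zeroed, or otherwise unparsable slot: skip it.
            continue
        if best is None or generation > best[0]:
            best = (generation, state)
    return best
```

If `load_checkpoint` returns `None`, both slots are unusable. At that point the caller has to choose between the two behaviors described in the thread: abort the task, or restart processing from the beginning.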
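
The exit status in Matthias's first post appears both as -226 and as a long hexadecimal value. Assuming the client printed the signed status as a 64-bit two's-complement pattern (an assumption, not something the thread states), the two forms are the same number. The helper names below are hypothetical and only illustrate the arithmetic.

```python
def to_unsigned(value, bits=64):
    """Two's-complement bit pattern of a signed integer, as an unsigned int."""
    return value & ((1 << bits) - 1)


def to_signed(value, bits=64):
    """Interpret an unsigned bit pattern as a signed two's-complement integer."""
    value &= (1 << bits) - 1
    if value >= 1 << (bits - 1):
        value -= 1 << bits
    return value


print(hex(to_unsigned(-226)))         # 0xffffffffffffff1e
print(to_signed(0xFFFFFFFFFFFFFF1E))  # -226
```

The low byte is 0x1e because 0x100 - 0xe2 = 0x1e, where 0xe2 is 226.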
©2023 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.