@SYSTEM_TOMBSTONE@1449222028760.txt
pid: 10466, tid: 10493, name: android.bg  >>> system_server <<<
signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0xa00000070
    x0   000000557e996710  x1   0000000a00000068  x2   0000007f79c31ba0  x3   0000000000000000
    x4   000000006fc46a80  x5   0000000000000001  x6   0000000000000000  x7   000000557e4f527c
    x8   0000000000000000  x9   000000557e4f5278  x10  0000000000000000  x11  0000000000000000
    x12  0000000000000000  x13  0000000000430000  x14  0000000000550000  x15  0000000000430000
    x16  0000007f8f640320  x17  0000007f8eff4f6c  x18  0000007f8c1d0470  x19  000000000000000a
    x20  0000007f8f590e9c  x21  000000557e9b14d0  x22  000000001354bf40  x23  000000006fdf61f0
    x24  0000007f79c31b20  x25  0000000012da2040  x26  0000000000000000  x27  00000000000f52be
    x28  0000000000000000  x29  0000000000358c82  x30  0000000072639c64
    sp   0000007f79c31680  pc   0000007f8eff4f70  pstate 0000000080000000

backtrace:
    #00 pc 0000000000029f70  /system/lib64/libbinder.so (android::IPCThreadState::flushCommands()+4)
    #01 pc 0000000000009c60  /data/dalvik-cache/arm64/system@framework@boot.oat
Look at the disassembly:

$ aarch64-linux-android-objdump -D symbols/system/lib64/libbinder.so
0000000000029f6c <_ZN7android14IPCThreadState13flushCommandsEv>:
   29f6c: f9400001  ldr  x1, [x0]
=> 29f70: b9400822  ldr  w2, [x1,#8]
   29f74: 6b1f005f  cmp  w2, wzr
   29f78: 5400006d  b.le 29f84 <_ZN7android14IPCThreadState13flushCommandsEv+0x18>
   29f7c: 52800001  mov  w1, #0x0    // #0
   29f80: 17ffd9ec  b    20730 <_ZN7android14IPCThreadState14talkWithDriverEb@plt>
   29f84: d65f03c0  ret
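The value of x1 is loaded from the address held in x0, and 0x0000000a00000068 is clearly not a valid address. The memory that x0 points to has been overwritten.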
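The link between the bad x1 value and the reported fault address can be checked with a little arithmetic. The following is a minimal sketch, not part of the original analysis; the helper name `effective_address` is made up for illustration, and the constants are copied from the tombstone above.

```cpp
#include <cassert>
#include <cstdint>

// Effective address of an AArch64 load of the form `ldr Wt, [Xn, #imm]`:
// the base register value plus the immediate offset.
// (Hypothetical helper, introduced only for this illustration.)
constexpr std::uint64_t effective_address(std::uint64_t base, std::uint64_t imm) {
    return base + imm;
}

// Values taken from the tombstone.
constexpr std::uint64_t kX1 = 0x0000000a00000068ULL;  // result of `ldr x1, [x0]`
constexpr std::uint64_t kFaultAddr = 0xa00000070ULL;  // "fault addr" on the signal line

// The faulting instruction is `ldr w2, [x1,#8]`, so the access goes to x1 + 8.
static_assert(effective_address(kX1, 8) == kFaultAddr,
              "fault address should equal x1 + 8");
```

Since x1 + 8 matches the fault address exactly, the crash is fully explained by the garbage value that was read from [x0].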
Look at the memory around x0:
memory near x0:
    000000557e9966f0 0000007f8f01e748 0000000000000000
    000000557e996700 0000000000000000 0000000000000007
    000000557e996710 0000000a00000068 0000007f0000000a
    000000557e996720 00000438234cda2a 3de6de183dc20f78
    000000557e996730 000000003d9e2680 0000000000000008
    000000557e996740 0000000000000000 0000000000000000
    000000557e996750 0000000000000000 0000000000000000
    000000557e996760 0000000000010001 0000000000000000
    000000557e996770 000000557e996840 0000000a00000068
    000000557e996780 000000550000000a 000004382647caaa
    000000557e996790 3de6de183dc20f78 000000003d9e2680
    000000557e9967a0 0000000000000000 0000000000000000
    000000557e9967b0 0000000000000000 0000000000000000
    000000557e9967c0 0000007f8e010001 0000000000000000
    000000557e9967d0 0000000000000000 000028e200000040
    000000557e9967e0 0000000a00000068 000000550000000a
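An obvious pattern emerges:
0x0000000a00000068 appears repeatedly, at intervals of 13*8 = 104 bytes, and each block has a very similar layout. This data is most likely an array of structs.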
@SYSTEM_TOMBSTONE@1449455808867.txt
pid: 5992, tid: 6053, name: AlarmManager  >>> system_server <<<
signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x553d89e3c0
    x0   00000055837f4dc0  x1   0000000000000004  x2   0000000000000440  x3   0000000000020000
    x4   0000000000000444  x5   000000553d89df80  x6   0000000000000000  x7   000000558336b27c
    x8   0000000000000000  x9   000000558336b278  x10  0000000000000000  x11  0000000000000000
    x12  0000000000000000  x13  0000000000430000  x14  0000000000550000  x15  0000000000430000
    x16  0000007f7e502478  x17  0000007f7e4dbb74  x18  0000007f7b6b0470  x19  00000055837f4dc0
    x20  000000008080005c  x21  0000005583838590  x22  000000008080005c  x23  000000006fd15b08
    x24  000000008080005c  x25  000000003000003a  x26  0000000012d5fe80  x27  0000000012c47400
    x28  000000000000005c  x29  0000007f68090bd0  x30  0000007f7ea66e10
    sp   0000007f68090bd0  pc   0000007f7e4dbbc4  pstate 0000000080000000

backtrace:
    #00 pc 0000000000030bc4  /system/lib64/libbinder.so (int android::Parcel::writeAligned<int>(int)+80)
    #01 pc 00000000000d3e0c  /system/lib64/libandroid_runtime.so
    #02 pc 0000000000109630  /data/dalvik-cache/arm64/system@framework@boot.oat
$ aarch64-linux-android-objdump -D symbols/system/lib64/libbinder.so
0000000000030b74 <_ZN7android6Parcel12writeAlignedIiEEiT_>:
   30b74: a9be7bfd  stp  x29, x30, [sp,#-32]!
   30b78: 910003fd  mov  x29, sp
   30b7c: a90153f3  stp  x19, x20, [sp,#16]
   30b80: aa0003f3  mov  x19, x0
   30b84: 2a0103f4  mov  w20, w1
   30b88: f9401002  ldr  x2, [x0,#32]
   30b8c: f9400c03  ldr  x3, [x0,#24]
   30b90: 91001044  add  x4, x2, #0x4
   30b94: eb03009f  cmp  x4, x3
   30b98: 54000109  b.ls 30bb8 <_ZN7android6Parcel12writeAlignedIiEEiT_+0x44>
   30b9c: d2800081  mov  x1, #0x4    // #4
   30ba0: 97ffbbe4  bl   1fb30 <_ZN7android6Parcel8growDataEm@plt>
   30ba4: 34000080  cbz  w0, 30bb4 <_ZN7android6Parcel12writeAlignedIiEEiT_+0x40>
   30ba8: a94153f3  ldp  x19, x20, [sp,#16]
   30bac: a8c27bfd  ldp  x29, x30, [sp],#32
   30bb0: d65f03c0  ret
   30bb4: f9401262  ldr  x2, [x19,#32]
   30bb8: f9400665  ldr  x5, [x19,#8]
   30bbc: aa1303e0  mov  x0, x19
   30bc0: d2800081  mov  x1, #0x4    // #4
=> 30bc4: b82268b4  str  w20, [x5,x2]
   30bc8: a94153f3  ldp  x19, x20, [sp,#16]
   30bcc: a8c27bfd  ldp  x29, x30, [sp],#32
   30bd0: 17ffbf3c  b    208c0 <_ZN7android6Parcel11finishWriteEm@plt>
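The native exception (NE) is caused by an invalid value in x5, and x5 is loaded from the memory at [x19, #8]. Look at the memory around x19: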
memory near x19:
    00000055837f4da0 0000000200000004 0000000a00000068
    00000055837f4db0 000000550000000a 0000015263c04a64
    00000055837f4dc0 3d5762703e10ca1c 000000553d89df80
    00000055837f4dd0 0000000000000440 0000000000020000
    00000055837f4de0 0000000000000440 0000000000000000
    00000055837f4df0 0000000000000000 0000000000000000
    00000055837f4e00 0000000000000000 0000000000010001
    00000055837f4e10 0000000a00000068 000000550000000a
    00000055837f4e20 0000015266bb3ae4 3d5762703e10ca1c
    00000055837f4e30 0000007f3d89df80 000000558379c020
    00000055837f4e40 000000558382ea70 0000656c62617300
    00000055837f4e50 0000000000000000 00000000000000d3
    00000055837f4e60 0000007f7eb1dc28 0000000000000000
    00000055837f4e70 0000007f7c8f2348 0000000a00000068
    00000055837f4e80 000000020000000a 0000015269b62b64
    00000055837f4e90 3d5762703e10ca1c 000000003d89df80
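The corrupted memory looks almost identical to the previous case!
For the remaining tombstones it was enough to search for 0000000a00000068: every one of them contains it, always at 104-byte intervals.
At this point it is all but certain that the crashes are caused by an array of 104-byte structs, each carrying the pattern 0x0000000a00000068, overwriting valid memory.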
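The manual search for the pattern and its spacing can be sketched in code. This is an illustrative helper rather than the tool used in the original investigation; `DumpRow` and `find_pattern` are hypothetical names, and the rows are the pattern-bearing lines copied from the "memory near x19" dump.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One row of a tombstone "memory near" dump: an address followed by the
// two 64-bit words stored at addr and addr + 8.
struct DumpRow {
    std::uint64_t addr;
    std::uint64_t words[2];
};

// Return every address at which `pattern` appears in the dump.
std::vector<std::uint64_t> find_pattern(const std::vector<DumpRow>& rows,
                                        std::uint64_t pattern) {
    std::vector<std::uint64_t> hits;
    for (const DumpRow& row : rows) {
        for (std::size_t i = 0; i < 2; ++i) {
            if (row.words[i] == pattern) {
                hits.push_back(row.addr + 8 * i);
            }
        }
    }
    return hits;
}

// Rows from the "memory near x19" dump that contain the pattern.
const std::vector<DumpRow> kRows = {
    {0x00000055837f4da0ULL, {0x0000000200000004ULL, 0x0000000a00000068ULL}},
    {0x00000055837f4e10ULL, {0x0000000a00000068ULL, 0x000000550000000aULL}},
    {0x00000055837f4e70ULL, {0x0000007f7c8f2348ULL, 0x0000000a00000068ULL}},
};
```

Calling `find_pattern(kRows, 0x0000000a00000068ULL)` yields three addresses, and the difference between consecutive hits is 0x68, which is the 104-byte stride described above.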
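The next step is to determine which struct 0x0000000a00000068 belongs to.
The maps section of each tombstone shows that the faulting address is always in heap memory, so both the overwriting data and the overwritten memory should have been allocated with malloc.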
@SYSTEM_TOMBSTONE@1449229865122.txt
...
00000055996ba000-0000005599d24fff rw- 6729728 [heap]
...
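Checking whether an address falls inside the [heap] mapping is a simple range test. The sketch below parses the address range from a maps line in the format shown above; `MapRange`, `parse_range`, `contains`, and `size_of` are hypothetical helpers, and the end address is assumed to be inclusive, as the printed size suggests.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>

// Address range of one mapping; `end` is inclusive in this tombstone format.
struct MapRange {
    std::uint64_t start;
    std::uint64_t end;
};

// Parse the leading "start-end" hex pair of a maps line.
bool parse_range(const char* line, MapRange* out) {
    return std::sscanf(line, "%" SCNx64 "-%" SCNx64, &out->start, &out->end) == 2;
}

// True if `addr` lies inside the mapping.
bool contains(const MapRange& r, std::uint64_t addr) {
    return addr >= r.start && addr <= r.end;
}

// Size of the mapping in bytes.
std::uint64_t size_of(const MapRange& r) {
    return r.end - r.start + 1;
}
```

For the line above, `size_of` gives 6729728, matching the size column, which supports reading the end address as inclusive. The corrupted address from the same tombstone can then be passed to `contains`.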
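The code is as follows. It wraps free(): before a block is released, it scans the block for the pattern and, on a match, logs the block contents and both call stacks: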
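#if defined(__aarch64__)
#include <stddef.h>

extern "C" void free(void* p);

// LOGD, dump_java_stack and dump_native_stack are helpers defined elsewhere.
extern "C" void inject_free(void* p) {
    if (p != NULL) {
        // Read the allocation size from the heap chunk header just before the pointer.
        size_t* head = (size_t*)((char*)p - sizeof(size_t));
        size_t size = *head & ~(size_t)7;
        size_t count = size / sizeof(int);
        int* p_int = (int*)p;
        // Stop one word early so that p_int[i + 1] stays inside the block.
        for (size_t i = 0; i + 1 < count; i++) {
            // Check whether the block contains the pattern.
            if (p_int[i] == 0x00000068 && p_int[i + 1] == 0x0000000a) {
                LOGD("size=%zu", size);
                for (size_t j = 0; j < count; j++) {
                    LOGD("p_int[%zu]=%08x", j, p_int[j]);
                }
                dump_java_stack();
                dump_native_stack();
                break;  // one dump per block is enough
            }
        }
    }
    free(p);  // finally call the real free
}
#endif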
# logcat |grep INJECT
D/INJECT ( 912): size=102352
D/INJECT ( 912): p_int[0]=00000068
D/INJECT ( 912): p_int[1]=0000000a
D/INJECT ( 912): p_int[2]=0000000a
D/INJECT ( 912): p_int[3]=00000055
D/INJECT ( 912): p_int[4]=0eaa916c
D/INJECT ( 912): p_int[5]=0000493a
D/INJECT ( 912): p_int[6]=bdaadb00
D/INJECT ( 912): p_int[7]=bcd10f00
D/INJECT ( 912): p_int[8]=be057140
D/INJECT ( 912): p_int[9]=00000000
...
D/INJECT ( 912): p_int[25586]=00007041
D/INJECT ( 912): p_int[25587]=00000000
D/INJECT ( 912): #00 pc 0000000000000a24 /system/lib64/libinjectapis.so (inject_free+236)
D/INJECT ( 912): #01 pc 000000000000fc28 /system/lib64/libsensorservice.so
D/INJECT ( 912): #02 pc 000000000000fcf4 /system/lib64/libsensorservice.so
D/INJECT ( 912): #03 pc 00000000000141b4 /system/lib64/libutils.so (android::RefBase::decStrong(void const*) const+560)
D/INJECT ( 912): #04 pc 0000000000029858 /system/lib64/libbinder.so (android::IPCThreadState::processPendingDerefs()+140)
D/INJECT ( 912): #05 pc 000000000002a734 /system/lib64/libbinder.so (android::IPCThreadState::joinThreadPool(bool)+68)
D/INJECT ( 912): #06 pc 0000000000031d68 /system/lib64/libbinder.so
D/INJECT ( 912): #07 pc 00000000000179c0 /system/lib64/libutils.so (android::Thread::_threadLoop(void*)+188)
D/INJECT ( 912): #08 pc 000000000009277c /system/lib64/libandroid_runtime.so (android::AndroidRuntime::javaThreadShell(void*)+96)
D/INJECT ( 912): #09 pc 0000000000017224 /system/lib64/libutils.so
D/INJECT ( 912): #10 pc 000000000001cbb0 /system/lib64/libc.so (__pthread_start(void*)+52)
D/INJECT ( 912): #11 pc 0000000000019044 /system/lib64/libc.so (__start_thread+16)
It looks like the culprit has been caught!
Use addr2line to find the file and line:
$ aarch64-linux-android-addr2line -f -e symbols/system/lib64/libsensorservice.so fc28
_ZN7android13SensorService21SensorEventConnectionD1Ev
/home/mi/disk/2-v6-l-hermes-dev/frameworks/native/services/sensorservice/SensorService.cpp:1021
@frameworks/native/services/sensorservice/SensorService.cpp
SensorService::SensorEventConnection::~SensorEventConnection() {
    ALOGD_IF(DEBUG_CONNECTIONS, "~SensorEventConnection(%p)", this);
    mService->cleanupConnection(this);
    if (mEventCache != NULL) {
        delete mEventCache;
    }
}
So it is the delete mEventCache statement that frees the memory carrying 0000000a00000068. Here is the code that allocates it:
@frameworks/native/services/sensorservice/SensorService.cpp
status_t SensorService::SensorEventConnection::sendEvents(
        sensors_event_t const* buffer, size_t numEvents,
        sensors_event_t* scratch,
        SensorEventConnection const * const * mapFlushEventsToConnections) {
    ...
    mEventCache = new sensors_event_t[mMaxCacheSize];
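That mEventCache is an array of sensors_event_t fits the pattern seen in the tombstones. The sketch below is a simplified stand-in for the struct, assuming the layout in the AOSP hardware/sensors.h header of that era, with the payload union collapsed into a float array; `SensorsEventSketch` and `decode` are hypothetical names, and the decoding assumes a little-endian host, as on arm64.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Simplified stand-in for sensors_event_t (assumption: same field order and
// sizes as the AOSP header; the real payload is a union, not a plain array).
struct SensorsEventSketch {
    std::int32_t  version;       // must be sizeof(sensors_event_t)
    std::int32_t  sensor;        // sensor handle
    std::int32_t  type;          // sensor type
    std::int32_t  reserved0;
    std::int64_t  timestamp;     // nanoseconds
    float         data[16];      // payload
    std::uint32_t flags;
    std::uint32_t reserved1[3];
};

// One event occupies 104 (0x68) bytes, matching the stride in the tombstones.
static_assert(sizeof(SensorsEventSketch) == 104, "unexpected event size");

// Reinterpret the first six 32-bit words of a dumped block as an event header.
SensorsEventSketch decode(const std::uint32_t (&words)[6]) {
    SensorsEventSketch e{};
    std::memcpy(&e, words, sizeof(words));
    return e;
}
```

Feeding p_int[0] through p_int[5] from the logcat output into `decode` gives version 104 (0x68), sensor 10 and type 10. Read as one 64-bit word, version and sensor together form exactly 0x0000000a00000068.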