
Research on Key Technologies for Efficient Computation on Large-Scale Graph Data (大规模图数据的高效计算关键技术研究)

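Cutting-edge, systematic, readable. A ladder into specialized research areas, a bridge into interdisciplinary fields, a source of inspiration for R&D innovation.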

Author: 章明星 (Zhang Mingxing)
Price: 89 yuan
Edition-impression: 1-1
ISBN: 9787302542537
Publication date: 2020.05.01
Printing date: 2020.04.15

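Thanks to their expressive power, graph data structures are widely used to model data whose elements have complex relationships, such as social networks and knowledge graphs. Techniques for analyzing large-scale graph data have therefore become one of the hot research topics in academia and industry. Many graph computing systems have been proposed and deployed, with great commercial success. This book divides the data loading path of graph computing systems in different environments into four stages, studies each in turn, and distills a series of optimization methods that can serve as a reference for researchers in the field.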


Advisor's Preface

Thanks to their expressive power, graph data structures are widely used to model data whose elements have complex relationships. Techniques for analyzing large-scale graph data have therefore become a hot research topic in academia and industry. Many graph computing systems have been proposed and deployed, with great commercial success. Building on this prior work, the author of this book, Dr. 章明星 (Zhang Mingxing), has kept innovating: by steadily optimizing the loading speed of graph data in a variety of scenarios, he has obtained important results in several directions and has published a number of papers at top international conferences such as OSDI, ASPLOS, VLDB, ATC, HPCA, and ICS. His doctoral dissertation was also recognized as an ACM SIGSOFT Distinguished Paper, a Tsinghua University Outstanding Doctoral Dissertation, a Beijing Outstanding Doctoral Dissertation, and a recipient of the IEEE TCSC Award for Excellence (Outstanding Ph.D. Dissertation).

More importantly, in the course of his research on graph computing, Dr. Zhang distilled a complete set of system optimization methods. Through in-depth analysis, and starting from the observation that graph computing has poor data locality and a low computation cost per vertex or edge, he found that its main performance bottleneck is the loading of graph data. Building on this finding, he unified graph computing optimizations across scenarios into one consistent approach: view the whole distributed system as a multi-level architecture (Cache/PIM → memory → disk/network), then raise overall efficiency by optimizing the locality between each pair of adjacent levels. Following this approach, he carried out detailed optimizations targeting the loading bottleneck in parallel graph computing, single-machine in-memory graph computing, single-machine out-of-core graph computing, and processing-in-memory acceleration, and obtained substantial performance gains in each.

The book first describes how existing graph computing systems are mostly built on simplifying assumptions, such as vertex properties being indivisible and individual compute operations being executable in isolation, and therefore rarely reach the highest computing efficiency that the underlying hardware can support. To address this, the author shows through analysis that the main efficiency bottleneck of graph computing is data loading speed, and accordingly divides the data loading path of graph computing systems in different environments into four stages and studies each in turn. The main contributions are:
① A three-dimensional task partitioning method for graph computing applications. Based on the finding that vertex properties in the data graph can be further partitioned, the method reduces communication volume by up to 90.6% and achieves a speedup of up to 4.7 times. This result was published at OSDI 2016, where it was one of the joint-first papers to list a Chinese university as the first affiliation and to include a faculty member of a Chinese university among its authors.
② A layered graph data organization format. By storing graph data in layers on external storage devices, it achieves a speedup of up to 6.4 times.
③ An automatic optimization algorithm for matrix-based graph computing engines. The algorithm is based mainly on loop fusion and also accounts for the data consistency requirements of a distributed environment; it speeds up the original program by up to 5.8 times.
④ A graph computing model for new processing-in-memory devices. For processing-in-memory, a new architecture that supports computing directly on memory devices, the book proposes a matching graph computing model that reduces communication volume by up to nearly 95%.

Abstract

Because of their expressive power, graphs are widely used to model complex relationships among data elements, as in social networks and knowledge graphs. With the continual evolution of information technology and the rapid growth of Internet applications, many large-scale graph processing systems have been proposed and deployed in the real world with great success, and optimizing these systems has become a hot research topic in both academia and industry.

However, current graph processing systems are typically built on simplifying assumptions, such as "a vertex is indivisible" and "each compute operation can be executed in isolation", and hence cannot achieve the best performance that the hardware can deliver. To resolve this problem, we investigate the field and find that the bottleneck of most graph applications is the speed of loading data. Based on this observation, we partition the load path of graph data into four stages and propose optimizations for each of them. The main innovations of this book are as follows.

(1) We design a novel 3D graph partitioning algorithm. The algorithm is based on the fact that the vertex property of many graph applications is actually divisible. Unlike traditional 1D and 2D partitioning, 3D partitioning allows a vertex in the data graph to be further split into sub-vertices that are assigned to different compute nodes. By exploring this dimension, which existing methods have not considered, our algorithm reduces the original communication cost by up to 90.6%.

(2) We define a layered graph organization format. By storing graph data in layers on external storage, this format enables the processing system to load more vertices at a time and hence reduces the number of times each vertex is reloaded. As a result, our method decreases total disk I/O for out-of-core graph processing systems and yields a speedup of up to 6.4 times.

(3) We propose an automatic optimization algorithm for matrix-based graph execution engines. The algorithm is mainly based on loop fusion and also takes into account the consistency requirements of a distributed environment. While preserving correctness, it improves the data locality of graph applications through automatic pipelining, which reduces memory bandwidth pressure, and achieves a speedup of up to 5.8 times.

(4) We implement a processing-in-memory (PIM)-oriented graph processing framework. By enforcing certain constraints in the programming model, our framework makes it possible to remove redundant communication automatically. It also provides an update propagation algorithm based on broadcast trees that reduces the communication load on bottleneck links. According to our calculation, the two methods together reduce communication cost by up to nearly 95%.

Key words: graph computing; distributed computing; matrix computing; locality; PIM
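The abstract names loop fusion as the basis of the third contribution. As a generic illustration of that textbook technique only (this is not the book's optimizer, and the function names below are hypothetical), the following Python sketch shows two passes over a list of vertex values being fused into one pass, so that each value is loaded once and no intermediate list is materialized:

```python
# Generic loop-fusion illustration; NOT the book's algorithm.
# All names here are hypothetical and chosen for this sketch.

def scale_then_add_unfused(values, scale, offset):
    # Pass 1: read every value and write an intermediate list.
    scaled = [v * scale for v in values]
    # Pass 2: read the intermediate list again and write the result.
    return [s + offset for s in scaled]


def scale_then_add_fused(values, scale, offset):
    # Single pass: each value is read once and the intermediate
    # list from the unfused version is never created.
    return [v * scale + offset for v in values]


if __name__ == "__main__":
    vertex_values = [1, 2, 3, 4]
    fused = scale_then_add_fused(vertex_values, 2, 1)
    unfused = scale_then_add_unfused(vertex_values, 2, 1)
    print(fused == unfused)
```

Both functions perform the same arithmetic in the same order per element, so they return identical results; the fused form only changes how often data is traversed, which is the locality effect the abstract refers to.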


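Other titles in this series

A Study of Compositional Forms in Ink-Wash Figure Painting (水墨人物画构成形式研究), 刘翔鹏, 99 yuan
Leadership, Public Service Motivation, and Rural China's ... (领导力、公共服务动机与中国农村集...), 舒全峰, 79 yuan
Joint Training for Neural Machine Translation, English edition (神经机器翻译的联合训练(英文版)), 程勇, 69 yuan
The Creation of a Nutrition Security System during the War of Resistance and China's ... (抗战时期营养保障体系的创建与中国...), 王公, 95 yuan
The Theme of Nature in Chinese Painting since the 20th Century (20世纪以来中国绘画中的自然主题), 邱敏, 89 yuan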
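  • The "Tsinghua University Outstanding Doctoral Dissertation Series" (hereafter the "Series") is selected from dissertations chosen since 2014 as Tsinghua University outstanding doctoral dissertations at the university level (top 5%). Each dissertation is further revised and expanded by its author, supplemented with an advisor's preface, and presented to readers as a monograph. The Series spans the main fields of the natural sciences, humanities, and social sciences and covers every first-level discipline offered at Tsinghua University. It represents the best doctoral dissertations in each discipline at Tsinghua, reflects the latest research progress in the relevant fields, and is cutting-edge, systematic, and readable. It is an essential reference for doctoral and master's students preparing thesis proposals and writing dissertations, and an effective way for researchers to gain a quick, systematic view of the state, latest progress, and new ideas in a specialized field.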
  • 鐩綍

    绗?1绔犲紩瑷€ .......................................................................................1

    1.1澶ц妯″浘璁$畻 .........................................................................1

    1.2鍥捐绠楃郴缁熺殑鍒嗙被 ...................................................................2

    1.3鍥炬暟鎹珮鏁堣绠楃殑鎸戞垬 ............................................................5

    1.3.1鍥捐绠楃殑鐗圭偣 ...............................................................6

    1.3.2鐜扮姸鍜屼富瑕佷紭鍖栨柟鍚?.....................................................7

    1.4涓昏璐$尞 ................................................................................9

    1.5鏈功缁勭粐缁撴瀯 ...........................................................


Copyright (C) 2019 Tsinghua University Press Co., Ltd. 京ICP备10035462号 京公网安备11010802013248号
