Session <Full Schedule · Contributors · Organizations · Search Program · My Schedule · Happening Now · MapsMore…Search ProgramMy ScheduleHappening NowMapsPosters, Research Posters: Research Posters DisplayEvent TypePosters, Research PostersTimeTuesday, 14 November 202310am - 5pm MSTLocationDEF ConcourseRegistration Categories TP XO/EX PresentationsPanSim: A Performance-Portable Agent Based ModelAuthorsIstvan Z. RegulyBence Keömley-HorváthGábor SzederkényiAttila Csikász-NagyAres – Simulating Type Ia Supernovae on Heterogeneous HPC ArchitecturesAuthorsLandon DykenAlexander HolasMark Ivan UgalinoMd Nageeb Bin ZamanBalancing Latency and Throughput of Distributed Inference by Interleaved ParallelismAuthorsJiangsu DuJinhui WeiJiazhi JiangShenggan ChengZhiguang ChenDan HuangYutong LuScalable Algorithms for Analyzing Large Dynamic Networks Using CANDYAuthorsAashish PandeyArindam KhandaSriram SrinivasanSudharshan SrinivasanS. M. ShovanFarahnaz HosseiniSajal DasBoyana NorrisSanjukta BhowmickParallel Optimization Methods for Direct Numerical Simulation of High Reynolds Number Wall Turbulence with a Grid Size of 100 BillionAuthorsJiabin XieGuangnan FengHan HuangJunxuan FengYutong LuPerformant Low-Order Matrix-Free Finite Element Kernels on GPE ArchitecturesAuthorsRandolph SettgastWilliam TobinNicola CastellettoYohann DudouitSergey KlevtsovBen CorbettIntroducing Prefetching and Data Compression to Accelerate Checkpointing for Inverse Seismic ProblemsAuthorsSandro RigoThiago MaltempiMarcio PereiraHervé YviquelJessé CostaGuido AraujoGPU-Accelerated Dense Covariance Matrix Generation for Spatial Statistics ApplicationsAuthorsZipei GengSameh AbdulahHatem LtaiefYing SunMarc GentonDavid KeyesParLeiden: Boosting Parallelism of Distributed Leiden Algorithm on Large-Scale GraphsAuthorsYongmin HuJing WangCheng ZhaoYibo LiuCheng ChenXiaoliang CongChao LiScalable Reduced-Order Modeling for Three-Dimensional Turbulent FlowAuthorsKazuto AndoRahul BaleAkiyoshi KurodaMakoto TsubokuraUnstructured Finite Element Models of Cardiac Electrophysiology Using a Deal.II-Based LibraryAuthorsLaryssa AbdalaSimone RossiDavid WellsBoyce GriffithA Methodology for Accelerating Variant Calling on GPUAuthorsBeatrice BranchiniAlberto ZeniMarco SantambrogioDeveloping an Inverse Reinforcement Learning Methodology to Predict the Progression of Colorectal CancerAuthorsSilba DowellDaniel HintzTyson LimatoShad SellersMilana WolffNicholas ChiaLiudmila MainzerAccelerating Actor-Based Distributed Triangle CountingAuthorsAniruddha MysoreKaushik RavichandranYoussef ElmougyAkihiro HayashiVivek SarkarScaling K-Path Centrality Using Optimized Distributed Data StructureAuthorsLance FletcherTrevor SteilRoger PearceSimulating Quantum Systems with NWQ-Sim on HPCAuthorsIn-Saeng SuhAng LiA Hybrid Factorization Solver with Mixed Precision Arithmetic for Sparse MatricesAuthorAtsushi SuzukiTowards Enabling Digital Twins Capabilities for a Cloud ChamberAuthorsJiaqi YangMohammad AtifVanessa Lopez-MarreroTao ZhangKwang Min YuMeifeng LinLingda LiFan YangYangang LiuAbdullahalmut SharfuddinFoluso LadeindeHigh-Performance PMEM-Aware Collective I/OsAuthorsKeegan SanchezAlex GavinSuren BynaKesheng WuXuechen Zhang Architecture and Networks I/O and File Systems An Early Case Study with Multi-Tenancy Support in SPDK’s NVMe-over-Fabric DesignsAuthorsDarren NgCharles ParkinsonAndrew LinArjun KashyapXiaoyi LuOptimizing Workflow Performance by Elucidating Semantic Data FlowAuthorsMeng TangNathan R. TallentAnthony KougkasXian-He SunThe Many Facets of a Dynamic Graph Processing SystemAuthorsJuntong LuoScott SallinenMatei Ripeanusys-sage: A Fresh View on Dynamic Topologies and Attributes of HPC SystemsAuthorsStepan VanecekMartin SchulzSimulating Application Agnostic Process Assignment for Graph Workloads on Dragonfly and Fat Tree TopologiesAuthorsMd Nahid NewazSayan GhoshJoshua SuetterleinNathan TallentHua MingGeospatial Filter and Refine Computations on NVIDIA Bluefield Data Processing Units (DPU)AuthorsDerda KaymakSatish PuriNeoRodinia: Evaluation of High-Level Parallel Programming Models and Compiler Transformation for GPU OffloadingAuthorsXinyao YiAnjia WangYonghong YanIntegrating TEZIP into LibPressio: A Case Study of Integrating a Dynamic Application into a Static C EnvironmentAuthorsIsita TalukdarAmarjit SinghRobert UnderwoodKento SatoWeikuan YuCharacterizing GPU Effectiveness on NRP for IceCube fp32 ComputeAuthorsDavid SchultzIgor SfiligoiBenedikt RiedelFrank WürthweinExploring Userspace Memory Mapping for RDMA-Enabled Network-Attached MemoryAuthorsJacob WahlgrenJennifer FajEric GreenMaya GokhaleIvy PengMinimizing Data Movement Using Distant FuturesAuthorsBarry Sly-DelgadoDouglas ThainWhy Wait!? Hades: An Active, Content-Aware System for Precalculating Derived QuantitiesAuthorsJaime CernudaLuke LoganAnthony KougkasXian-He SumExploring Green Cryptographic Hashing Algorithms for Eco-Friendly BlockchainsAuthorsAahad AbubakerTanmay AnandSonal GaikwadMahad HaiderJacklyn McAninchLan NguyenAlexandru OrheanIoan RaicuAutomating HPC Model Selection on Edge DevicesAuthorsAbrar HossainKishwar AhmedGraph Based Anomaly Detection in Chimbuko: Feasible or Fallible?AuthorsChase PhelpsAnkur LahiryTanzima Z. IslamChristopher KellyInvestigating Anomalies in Compute Clusters: An Unsupervised Learning ApproachAuthorsYiyang LuJie RenYasir AlanaziAhmed MohammedDiana McSpaddenLaura HildMark JonesWesley MooreMalachi SchramBryan HessEvgenia SmirniTemporal Classification of Allocations for Reduced Memory UsageAuthorsKristi BelcherDavid BeckingsaleSam SchwartzMarty McFaddenToward Inductive Synthesis of Compiler Heuristics: A Case Study with Register AllocationAuthorsMohammad AliApan QasemNeural Domain Decomposition for Variable Coefficient Poisson SolversAuthorsSebastian BarschkisZitong LiHengjie WangAparna ChandramowlishwaranSoftware Development Case Study: The Acceleration of a Distributed Application Using GPUsAuthorsMartin KuhnelAlex LoddochTao SunDelivering Digital Skills Across the Digital Divide: Creating an Accessible On-Demand Self-Paced HPC Virtual Training LabAuthorsBryan JohnstonLara TimmMabatho HashatsiEE-HPC – A Framework for Energy Efficient HPC System OperationAuthorsJan EitzingerThomas GruberReal-Time Change Point Detection in Molecular Dynamics Streaming DataAuthorsVijayalakshmi SaravananShinjae YooHubertus Van DamChristopher KellyThomas FlynnPerry SiehienKalyan MuppudojoAniket Kumar RameshA High-Performance I/O Framework for Accelerating DNN Model Updates Within Deep Learning WorkflowAuthorsJie YeJaime CernudaBogdan NicolaeAnthony KougkasXian-He SunHPC Accelerated Generative Deep Learning Approach for Creating Digital Twins of Climate ModelsAuthorsJohannes MeuerChristopher KadowThomas LudwigClaudia TimmreckA Portable Software Environment for Ultrahigh-Resolution ELM Development on GPUsAuthorsFranklin EaglebargerDali WangOptimizing Uncertainty Quantification of Vision Transformers in Deep Learning on Novel AI ArchitecturesAuthorsErik PautschJohn LISilvio RizziGeorge ThiruvathukalMaria PantojaTwo-Phase IO Enabling Large-Scale Performance IntrospectionAuthorsKe FanSidharth Kumar Performance Measurement, Modeling, and Tools Characterizing One-/Two-Sided Designs in OpenSHMEM CollectivesAuthorsYuke LiYanfei GuoXiaoyi LuModeling Parallel Programs Using Large Language ModelsAuthorsDaniel NicholsAniruddha MaratheHarshitha MenonTodd GamblinAbhinav BhateleMPI Performance Analysis in Vlasiator: Unraveling Communication BottlenecksAuthorsJennifer FajJeremy J. WilliamsIvy B. PengUrs GanseMarkus BattarbeeYann Pfau-KempfLeo KotipaloMinna PalmrothStefano MarkidisExploring Julia as a Unifying End-to-End Workflow Language for HPC on FrontierAuthorsWilliam F. GodoyPedro Valero-LaraCaira AndersonKatrina W. LeeAna GainaruRafael Ferreira da SilvaJeffrey S. VetterExploring the Impacts of Multiple I/O Metrics in Identifying I/O BottlenecksAuthorsIzzet YildirimHariharan DevarajanAnthony KougkasXian-He SunKathryn MohrorPipit: Simplifying Analysis of Parallel Execution TracesAuthorsAlexander MovsesyanRakrish DhakalAditya RanjanJordan MarryOnur CankurAbhinav BhateleCharacterizing the Performance of the Implicit Massively Parallel Particle-in-Cell iPIC3D CodeAuthorsJeremy Johnathan WilliamsDaniel MedeirosIvy PengStefano MarkidisEarly Experience in Characterizing Training Large Language Models on Modern HPC ClustersAuthorsHao QiLiuyao DaiWeicong ChenXiaoyi LuTransfer Learning Workflow for High-Quality I/O Bandwidth Prediction with Limited DataAuthorsDmytro PovaliaievRadita LiemJulian KunkelJay LofsteadPhilip CarnsDFToy: A New Proxy App for DFT CalculationsAuthorsArjen TamerusPhil HasnipHybrid CPU-GPU Implementation of Edge-Connected Jaccard Similarity in Graph DatasetsAuthorsAtharva GondhalekarPaul SathreWu-chun FengPreserving Data Locality in Multidimensional Variational Quantum ClassificationAuthorsMingyoung JengAlvir NobelVinayak JhaDavid LevyDylan KneidelManu ChaudharyIshraq IslamEsam El-Araby Artificial Intelligence/Machine Learning Post-Moore Computing Quantum Computing SCALABLE – Scalable Lattice Boltzmann Leaps to ExascaleAuthorsJayesh BadwaikLubomír ŘíhaRadim VavříkOndřej VysockýKristian KadlubiakGabriel StaffelbachMarkus HolzerPhilipp SuffaRomain CuidardDenis RicotImproving Memory Interfacing in HLS-Generated Accelerators with Custom CachesAuthorsClaudio BaroneGiovanni GozziMichele FioritoAnkur LimayeAntonino TumeoFabrizio Ferrandi Programming Frameworks and System Software Evaluating Performance Portability of GPU Programming ModelsAuthorsJoshua H. DavisPranav SivaramanIsaac MinnAbhinav Bhatele Heterogeneous Computing Performance Measurement, Modeling, and Tools The Impact of Process Topology on RMA Programming Models: A Study on NERSC PerlmutterAuthorsNikodemos KoutsoherasSayan GhoshNathan TallentJoshua SuetterleinAbhinav BhateleScalable Fine-Grained Gang Scheduling for HPC Systems with Unreliable Broadcast Synchronization MechanismsAuthorsHiroki OhtsujiErika HayashiReika KinoshitaMasahiro MiwaEiji YoshidaSophisticated Tools for Performance Analysis and Auto-Tuning of Performance Portable Parallel ProgrammingAuthorsVivek KaleDavid BoehmeKevin HuckShravan KaleVanessa SurjadidjajaJames BrandtThat's Right – The Same C++ STL Asynchronous Parallel Code Runs on CPUs and GPUsAuthorsMuhammad HaseebWeile WeiJack DeslippeBrandon CookSimulating Larger Quantum Circuits with Circuit Cutting and Quantum ServerlessAuthorsCaleb JohnsonBryce FullerJim GarrisonJennifer GlickQuantum Task Offloading with the OpenMP APIAuthorsJoseph K. L. LeeMartin RuefenachtJohannes DoerfertOliver Thomson BrownMark BullMichael KlemmMartin SchulzUnleashing CGRA Potential for HPCAuthorsBoma AdhiEmanuele Del SozzoCarlos CortesXinyuan WangTomohiro UenoKentaro SanoQuantum Computing Case Study in Aerospace FieldAuthorsNaoyuki FujitaYuusuke TakemotoSusumu TakatsuYasuyuki NishibayashiMitsuhiro HashimotoRyo SakuraiMitsuharu TakeoriTakahiro YamamotoJiayun ZhuRadium: Transparent Distributed Execution via Process VirtualizationAuthorsAidan CullyHusheng ZhouDusan VeljkoHyojong KimVance MillerJoel ZambranoMazhar MemonQASM-to-HLS: A Framework for Accelerating Quantum Circuit Emulation on High-Performance Reconfigurable ComputersAuthorsAnshul MauryaNaveed Mahmud