panoramictech.com

EUVL Modeling With Panoramic Technology

Panoramic Technology has been the leader in EUVL modeling since 1999. Both HyperLith and EM-Suite are capable of simulating many aspects of EUVL.

Several Important Features Make Panoramic the Most Powerful

  • Fourier Boundary Condition
  • Ultra-large FDTD simulation domains (>2 billion cells, >100GB memory)
  • Distributed Computing
  • Hardware Acceleration
  • EUV Defect Generator
  • Calibrated EUV Resist Models
  • Advanced Resist Modeling Infrastructure (ARMI) for developing new EUV resist models (including stochastic models)
  • Experience & Support

What Panoramic Technology Offers

  • Simulation Software (EM-Suite, HyperLith, TEMPESTpr2, SimRunner/PSS/HSS, Resist, ARMI, SOAPI)
  • Consulting Services
  • Combination of Consulting and Software


We're smarter!

Cool


We're faster.

Our in-house programmers are amazing. They can implement and deploy new features in a flash, and they know all aspects of the code inside and out.


We're more motivated.

We are a small, employee-owned company.  Our employees are highly motivated and success-oriented.


We're more focused.

We're not some small part of a bigger company.  We're not trying to sell machines or EDA software.  We're focused on selling simulation software for advanced lithography research - and that's it!


We're smaller and more efficient.

(a nice way to say that we're frugal!)


We have a better business model.

We feel that being a small, independent company is the way to go. We spend less on advertising and more on development. We're not after a quick buck - we prefer slow and steady growth. We're not rapidly raising prices - we want our customers to be accustomed to our steady, reasonable prices. We offer a better simulator at a lower price.


Panoramic Technology Inc. has the following positions open:

Photoresist Modeling/Applications Engineer

Requirements

  • Photoresist modeling experience required.
  • Lithography simulation experience required.
  • Candidate should have published papers in the field of resist simulation and modeling at conferences such as SPIE Advanced Lithography and SPIE Photomask.

Position will involve

  • developing photoresist models
  • tuning resist parameter sets
  • working with customers on photoresist modeling and litho simulation in general
  • photoresist and lithography simulation research

Benefits

  • extremely exciting small-company atmosphere
  • ability to work from your home (even if you live in Texas, for example!)
  • or, working in Berkeley, CA
  • benefits:  retirement plan, health insurance
  • salary will be competitive, with significant bonus potential

At Panoramic Technology we have been continuously improving our lithography simulation software for over ten years. We're always at the forefront of the technology, implementing new features before our competitors (EUV, polarization issues in immersion, distributed computing, wafer topography, etc.).

We continue to maintain our vigorous pace of development and we continue to raise the bar for lithography research simulators.

Near Term Plans

  • Advanced Resist Modeling Infrastructure (ARMI) - gives the user the ability to do advanced resist model development "in-house".
    • PanTune - a general purpose "tuner" that can be used for resist model parameter calibration.
    • User-Written Resist Models (UWRM) - user can program their own custom resist models and insert them into the EM-Suite/HyperLith simulation infrastructure - as a direct peer to the existing resist models.
  • Continued resist modeling research - we are working with several customers on tuning resist parameter sets, and solving modeling issues with EUV and DUV resists.
  • Extend SOAPI (our MATLAB(TM)/Java(TM) API) to HyperLith
  • Improve Gazillion and PanOPC integration into HyperLith
  • Application notes, examples, training-videos, documentation
  • Wafer-topography/Double-Patterning research, modeling, GUI improvements
  • Develop "quasi-rigorous" mask models for EUV
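The UWRM idea above can be made concrete with a short sketch. Everything here is a hypothetical illustration: the MackResistModel class and develop_rate method are names invented for this example, not the actual EM-Suite/HyperLith UWRM API; only the development-rate expression (the standard Mack model) is an established formula.

```python
# Hypothetical illustration of a user-written resist model.
# The class/method names are invented for this sketch and are NOT
# the actual EM-Suite/HyperLith UWRM interface; the rate expression
# is the standard Mack development-rate model.

class MackResistModel:
    def __init__(self, r_max=100.0, r_min=0.1, m_th=0.5, n=2):
        self.r_max = r_max  # development rate of fully exposed resist (nm/s)
        self.r_min = r_min  # development rate of unexposed resist (nm/s)
        self.m_th = m_th    # threshold inhibitor concentration
        self.n = n          # dissolution selectivity (n > 1)

    def develop_rate(self, m):
        """Mack development rate for relative inhibitor concentration m in [0, 1]."""
        a = (self.n + 1) / (self.n - 1) * (1 - self.m_th) ** self.n
        return (self.r_max * (a + 1) * (1 - m) ** self.n
                / (a + (1 - m) ** self.n) + self.r_min)

model = MackResistModel()
print(model.develop_rate(0.0))  # fully exposed: rate approaches r_max
print(model.develop_rate(1.0))  # unexposed: rate falls to r_min
```

A real user-written model would presumably be evaluated by the simulator on the post-exposure grid; the point of the infrastructure is that such a model runs as a direct peer of the built-in resist models.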

Long Term Plans

  • Maintain the lead in power and flexibility
  • Incorporate new features as the technology advances
  • Continuously improve speed, and ability to simulate larger areas
  • Larger area (but not full-chip) OPC correction that is better suited to manufacturing
  • Continuous GUI improvements - ease of use without sacrificing power and flexibility is always a priority
  • Source-Mask-Optimization & Inverse Algorithms

Ultimately, we plan to become the dominant lithography "research" simulator. We feel this will happen because we deliver the most powerful, most flexible lithography simulator at a price that cannot be matched by our competitors. (see How and Why Can Panoramic Technology Offer the Best Lithography Research Simulator for Such an Amazing Low Price?)


The goal of this page is not to demonstrate the raw speed of the simulator, but to demonstrate the speed-up that can be obtained by using multiple cores, processors, and GPUs in different ways. You can run these simulations on your own machine (they are based on the examples that ship with the software) and see how your hardware compares to the machines we've tested.
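The speed-ups quoted in the comments below are simply ratios of effective cycle times against the single-core baseline. Here is a minimal Python sketch using the A-series numbers from the table (the cycle times come from the rows below; the script itself is only an illustration):

```python
# Effective cycle times (s/cycle) from benchmark rows A1-A4 below.
BASELINE = 187.0  # A1: one single-threaded PSS, no SimRunner

configs = {
    "A2 (1x 4-threaded PSS)": 95.0,
    "A3 (2x 1-threaded PSS)": 104.0,
    "A4 (2x 2-threaded PSS)": 61.0,
}

# Speedup = baseline cycle time / configuration cycle time.
speedups = {name: BASELINE / t for name, t in configs.items()}
for name, s in speedups.items():
    print(f"{name}: {s:.2f}X speedup")  # e.g. A4 comes out just over 3X
```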

Each benchmark entry below lists the simulation, the hardware, the PSS/HSS configuration (with the PSS/HSS license requirement in parentheses), the effective cycle time* in seconds per cycle, and a comment.

A1
  Simulation: Elbow.sim, 3GB, 3D EUV with Fourier Boundary Condition, non-complex
  Hardware: Box #1: 2x Opteron 285, 16GB DDR400
  Config: 1x 1-threaded-PSS (PSS/HSS licenses: 1/0)
  Effective cycle time*: 187 s/cycle
  Comment: Single-core (i.e. no SimRunner).

A2
  Simulation: same as A1
  Hardware: same as A1
  Config: 1x 4-threaded-PSS (PSS/HSS licenses: 4/0)
  Effective cycle time*: 95 s/cycle
  Comment: 4 cores give a 2X speedup with multi-threading (4 cores working on one job).

A3
  Simulation: same as A1
  Hardware: same as A1
  Config: 2x 1-threaded-PSS (PSS/HSS licenses: 2/0)
  Effective cycle time*: 104 s/cycle
  Comment: 2 cores give almost a 2X speedup with job distribution (2 cores working on two jobs independently). This is always more efficient than multi-threading, but requires more memory.

A4
  Simulation: same as A1
  Hardware: same as A1
  Config: 2x 2-threaded-PSS (PSS/HSS licenses: 4/0)
  Effective cycle time*: 61 s/cycle
  Comment: The combination of multi-threading and job distribution seems optimal: 4 cores give a 3X speedup, at the cost of memory for two simulations. This seems reasonable on the AMD dual-core architecture, where each processor (pair of cores) has its own memory controller and "close" memory.

A5
  Simulation: same as A1
  Hardware: same as A1
  Config: 1x SuperPSS{4x 1-threaded-PSS} (PSS/HSS licenses: 4/0)
  Effective cycle time*: 64 s/cycle
  Comment: Almost a 3X speedup with 4 cores, but uses less memory than A4. Much faster than A2.

A6
  Simulation: same as A1
  Hardware: Box #1: 2x Opteron 280, 16GB DDR400; Box #2: 2x Opteron 270, 16GB DDR400
  Config: 1x SuperPSS{8x 1-threaded-PSS} (PSS/HSS licenses: 8/0)
  Effective cycle time*: 60 s/cycle
  Comment: Not much faster than A4 or A5. Uses less memory per machine than A4.

A7
  Simulation: same as A1
  Hardware: same as A6
  Config: 2x SuperPSS{2x 2-threaded-PSS} (PSS/HSS licenses: 8/0)
  Effective cycle time*: 49 s/cycle
  Comment: The Opteron 270 machine is slower; if both machines were Opteron 285's, we would expect double the performance of A4.

A8
  Simulation: same as A1
  Hardware: Box #1: 2x Tesla C870
  Config: 1x 2-GPU-HSS (PSS/HSS licenses: 0/2)
  Effective cycle time*: 18 s/cycle
  Comment: Simulation fits entirely within two cards.

A9
  Simulation: same as A1
  Hardware: Box #1: 1x Tesla C870
  Config: 1x 1-GPU-HSS (PSS/HSS licenses: 0/1)
  Effective cycle time*: 29 s/cycle
  Comment: More than 2X faster than A4 (1 HSS license vs. 4 PSS licenses).

A10
  Simulation: same as A1
  Hardware: Box #1: 2x Tesla C870; Box #2: 2x Tesla C870
  Config: 2x 2-GPU-HSS (PSS/HSS licenses: 0/4)
  Effective cycle time*: 9 s/cycle
  Comment: Double the performance of A8 (running two cases at once).

A11
  Simulation: same as A1
  Hardware: same as A10
  Config: 1x SuperPSS{2x 2-GPU-HSS} (PSS/HSS licenses: 0/4)
  Effective cycle time*: 43 s/cycle
  Comment: Poor performance because of the communication overhead of SuperPSS.

B1
  Simulation: AltPSM_Contacts with pitch=2.2 (9.1GB)
  Hardware: Box #1: 2x Opteron 285, 16GB DDR366, 2x C870
  Config: 1x 1-GPU-HSS (PSS/HSS licenses: 0/1)
  Effective cycle time*: 162 s/cycle
  Comment: The DDR400 memory was slowed to 366MHz.

B2
  Simulation: same as B1
  Hardware: Box #2: 2x Intel 5440, 32GB DDR2 667, 1x C870
  Config: 1x 1-GPU-HSS (PSS/HSS licenses: 0/1)
  Effective cycle time*: 135 s/cycle
  Comment: This machine has faster memory than B1.

B3
  Simulation: same as B1
  Hardware: same as B2
  Config: 2x SuperPSS{4x 1-threaded-PSS} (PSS/HSS licenses: 8/0)
  Effective cycle time*: 288 s/cycle
  Comment: Using all 8 cores is slower than 1x Tesla C870 on the same machine (see B2).

B4
  Simulation: same as B1
  Hardware: Box #1: 2x Opteron 285, 16GB DDR366, 2x C870
  Config: 1x 2-GPU-HSS (PSS/HSS licenses: 0/2)
  Effective cycle time*: 101 s/cycle
  Comment: Using 2 C870's instead of 1 improves the cycle time from 162s to 101s. That is not the 2X speedup one might expect, but it is still a decent one (162/101 = 1.6X).

B5
  Simulation: same as B1
  Hardware: Box #2: 2x Intel 5440, 32GB DDR2 667, 1x C870
  Config: 1x 8-threaded-PSS (PSS/HSS licenses: 8/0)
  Effective cycle time*: 326 s/cycle
  Comment: See B3.

B6
  Simulation: same as B1
  Hardware: same as B5
  Config: 2x SuperPSS{2x 2-threaded-PSS} (PSS/HSS licenses: 8/0)
  Effective cycle time*: 314 s/cycle
  Comment: See B5 & B3.

B7
  Simulation: same as B1
  Hardware: same as B5
  Config: 1x SuperPSS{1x 8-threaded-PSS, 1x 1-GPU-HSS} (PSS/HSS licenses: 8/1)
  Effective cycle time*: 287 s/cycle
  Comment: Better to use the HSS alone; the PSS's can't help it, they only slow it down. See B2.

C1
  Simulation: AltPSM_Contacts, pitch=0.3 (169MB)
  Hardware: Box #2: 2x Intel 5440, 32GB DDR2 667, 1x 8800 GT-OC
  Config: 1x 1-GPU-HSS (PSS/HSS licenses: 0/1)
  Effective cycle time*: 1.57 s/cycle
  Comment: This is just a graphics card (8800 GT-OC) with 512MB of GDDR3 memory. The card was driving video during the simulation (it might be slightly faster without video).

C2
  Simulation: same as C1
  Hardware: Box #2: 2x Intel 5440, 32GB DDR2 667, 1x C870
  Config: 1x 1-GPU-HSS (PSS/HSS licenses: 0/1)
  Effective cycle time*: 1.29 s/cycle
  Comment: Compare to C1: the Tesla C870 beats the less expensive 8800 GT-OC even for a small simulation that fits entirely within the card's memory.

C3
  Simulation: same as C1
  Hardware: Box #2: 2x Intel 5440, 32GB DDR2 667
  Config: 1x 1-threaded-PSS (PSS/HSS licenses: 1/0)
  Effective cycle time*: 9.90 s/cycle
  Comment: The Tesla C870 is 7.67X faster than a single core of the Intel 5440; the 8800 GT-OC is only 6.3X faster.

C4
  Simulation: same as C1
  Hardware: Box #1: 2x Opteron 285, 16GB DDR366, 2x C870
  Config: 1x 1-threaded-PSS (PSS/HSS licenses: 1/0)
  Effective cycle time*: 9.90 s/cycle
  Comment: The older Opteron 285 is the same speed as the newer Intel 5440!?

C5
  Simulation: same as C1
  Hardware: same as C4
  Config: 1x 1-GPU-HSS (PSS/HSS licenses: 0/1)
  Effective cycle time*: 1.35 s/cycle
  Comment: The Tesla C870 on the Opteron 285 with 366MHz DDR is slower than on the Intel 5440 with DDR2 667MHz (as expected).

D1
  Simulation: AltPSM_Contacts, pitch=0.8 (1.2GB)
  Hardware: Box #1: 2x Opteron 285, 16GB DDR366, 2x C870
  Config: 1x 1-GPU-HSS (PSS/HSS licenses: 0/1)
  Effective cycle time*: 9.9 s/cycle
  Comment: Simulation fits entirely within the Tesla C870's 1.5GB memory.

D2
  Simulation: same as D1
  Hardware: same as D1
  Config: 1x 1-threaded-PSS (PSS/HSS licenses: 1/0)
  Effective cycle time*: 143 s/cycle
  Comment: Compare to D1: here the Tesla C870 is 14X faster than the Opteron 285 processor. This is the "sweet spot" for the C870 because the simulation is large but still fits inside the card.

D3
  Simulation: same as D1
  Hardware: Box #2: 2x Intel 5440, 32GB DDR2 667
  Config: 1x 1-threaded-PSS (PSS/HSS licenses: 1/0)
  Effective cycle time*: 83 s/cycle
  Comment: Here the newer Intel 5440 with DDR2 667MHz beats the older Opteron 285 with DDR 366MHz (as expected).

D4
  Simulation: same as D1
  Hardware: Box #2: 2x Intel 5440, 32GB DDR2 667, 1x C870
  Config: 1x 1-GPU-HSS (PSS/HSS licenses: 0/1)
  Effective cycle time*: 8.7 s/cycle
  Comment: Here we see a 9.5X speedup compared to the late-model Intel 5440 processor. Note that this cycle time is faster than the C870 on the older Opteron machine (D1), so the host system does matter.

E1
  Simulation: AltPSM_Contacts, pitch=0.3 (169MB)
  Hardware: Box #1: 2x Opteron 285, 16GB DDR366, 2x C870
  Config: 1x 1-GPU-HSS (PSS/HSS licenses: 0/1)
  Effective cycle time*: 1.35 s/cycle
  Comment: Compare with E1a.

E1a
  Simulation: same as E1
  Hardware: same as E1, but with 2x C1060
  Config: same as E1
  Effective cycle time*: 0.67 s/cycle
  Comment: Compare with E1: the C1060 has 2X the processing power of the C870.

E2
  Simulation: same as E1
  Hardware: same as E1 (2x C870)
  Config: 1x 2-GPU-HSS (PSS/HSS licenses: 0/2)
  Effective cycle time*: 1.42 s/cycle
  Comment: As expected, no improvement from using more cards on a small simulation that fits within one card (compare to E1).

E2a
  Simulation: same as E1
  Hardware: same as E1, but with 2x C1060
  Config: same as E2
  Effective cycle time*: 0.65 s/cycle
  Comment: Basically the same as E1a.

E3
  Simulation: same as E1
  Hardware: same as E1 (2x C870)
  Config: 2x 1-GPU-HSS (PSS/HSS licenses: 0/2)
  Effective cycle time*: 0.68 s/cycle
  Comment: Running two simulations at the same time; compare to E1.

E3a
  Simulation: same as E1
  Hardware: same as E1, but with 2x C1060
  Config: same as E3
  Effective cycle time*: 0.36 s/cycle
  Comment: Same, but compare to E1a.

E4
  Simulation: Elbow.sim, 3GB, 3D EUV with Fourier Boundary Condition, non-complex
  Hardware: same as E1 (2x C870)
  Config: 2x 1-GPU-HSS (PSS/HSS licenses: 0/2)
  Effective cycle time*: 17.5 s/cycle
  Comment: Compare with E4a.

E4a
  Simulation: same as E4
  Hardware: same as E1, but with 2x C1060
  Config: same as E4
  Effective cycle time*: 8.25 s/cycle
  Comment: The C1060 is more than 2X faster than the C870; compare with E4.

E5
  Simulation: same as E4
  Hardware: same as E1 (2x C870)
  Config: 1x 2-GPU-HSS (PSS/HSS licenses: 0/2)
  Effective cycle time*: 18.4 s/cycle
  Comment: Compare with E5a.

E5a
  Simulation: same as E4
  Hardware: same as E1, but with 2x C1060
  Config: same as E5
  Effective cycle time*: 14.8 s/cycle
  Comment: The modest improvement over E5 is expected, because the 2nd card is not utilized at all: the simulation fits within the first C1060. In E5 both C870's run at the same time; here only one card runs while the other sits idle.

E6
  Simulation: Elbow.sim with 6-degree incidence (complex simulation) and pitch=76nm, 10GB, 3D EUV with Fourier Boundary Condition
  Hardware: same as E1 (2x C870)
  Config: 1x 2-GPU-HSS (PSS/HSS licenses: 0/2)
  Effective cycle time*: 92 s/cycle
  Comment: The domain is divided into 7 parts: the first 6 run in simultaneous pairs, and the 7th runs on one card while the other remains idle. Card utilization is 7/8 = 87.5% (excluding CPU memory-transfer overhead).

E6a
  Simulation: same as E6
  Hardware: same as E1, but with 2x C1060
  Config: same as E6
  Effective cycle time*: 67 s/cycle
  Comment: The domain is divided into 3 parts: the first 2 run simultaneously, and the 3rd runs on one card while the other remains idle. Card utilization is 3/4 = 75% (excluding CPU memory-transfer overhead). The speedup over E6 falls short of 2X because GPU utilization is lower and the CPU transfer overhead may be large, especially since this box has DDR366 (not even DDR2) and only PCI Express x16 gen 1 (not gen 2). With PCI Express x16 gen 2 and DDR2-800, the improvement would probably be closer to 2X.
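The card-utilization percentages quoted for these multi-part runs follow from simple round-robin scheduling arithmetic; here is a minimal sketch (the function name is ours, not part of the software):

```python
import math

def card_utilization(num_parts, num_cards=2):
    # A domain split into num_parts pieces runs in rounds of up to
    # num_cards simultaneous pieces; the last round may leave cards idle.
    rounds = math.ceil(num_parts / num_cards)
    return num_parts / (rounds * num_cards)

print(card_utilization(7))  # 7 parts on 2 cards -> 0.875 (87.5%, as in E6)
print(card_utilization(3))  # 3 parts on 2 cards -> 0.75  (75%, as in E6a)
```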

*Note: "Effective" cycle time is the total cycle time divided by the number of cases running. For example, if you have 5 PSS's running 5 different simulations (of the same size) and each has a cycle time of 10s, then the effective cycle time is 10s/5 = 2s. A "cycle" is the amount of time TEMPESTpr2 takes to propagate the fields one wavelength.
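The footnote's definition in code form, using its own worked example (a trivial sketch; the function name is ours):

```python
def effective_cycle_time(cycle_time_s, num_cases):
    # Total cycle time divided by the number of cases running at once.
    return cycle_time_s / num_cases

# Footnote example: 5 PSS's, each running a different same-size
# simulation with a 10s cycle time.
print(effective_cycle_time(10.0, 5))  # -> 2.0 s/cycle effective
```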