14277:73d5e60b3a7c |
06-Sep-2019 |
Gabe Black <gabeblack@google.com> |
arch, x86: Rework the debug faults and microops.
This makes the non-fatal microops advance the PC, and adds missing functions. The *_once Faults now also can be run once per *something*. They would previously be run once per Fault invoke function which is common to all M5WarnOnceFaults. The warn_once microop will now warn once per message.
Change-Id: I05974b93f3b2700077a411b243679c2ff0e8c2cb Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/20739 Reviewed-by: Gabe Black <gabeblack@google.com> Reviewed-by: Brandon Potter <Brandon.Potter@amd.com> Maintainer: Gabe Black <gabeblack@google.com> Tested-by: kokoro <noreply+kokoro@google.com> |
13613:a19963be12ca |
20-Nov-2018 |
Gabe Black <gabeblack@google.com> |
x86: Stop using/defining some ISA specific register types.
These have been replaced with the generic RegVal type.
Change-Id: I75c1134212067dea43aa0903d813633e06f3d6c6 Reviewed-on: https://gem5-review.googlesource.com/c/14476 Reviewed-by: Gabe Black <gabeblack@google.com> Maintainer: Gabe Black <gabeblack@google.com> |
13611:c8b7847b4171 |
19-Nov-2018 |
Gabe Black <gabeblack@google.com> |
arch: cpu: Rename *FloatRegBits* to *FloatReg*.
Now that there's no plain FloatReg, there's no reason to distinguish FloatRegBits with a special suffix since it's the only way to read or write FP registers.
Change-Id: I3a60168c1d4302aed55223ea8e37b421f21efded Reviewed-on: https://gem5-review.googlesource.com/c/14460 Reviewed-by: Brandon Potter <Brandon.Potter@amd.com> Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Maintainer: Gabe Black <gabeblack@google.com> |
13441:d70ffc3dabf0 |
20-Nov-2018 |
Gabe Black <gabeblack@google.com> |
x86: Get rid of a problematic DPRINTF in PremFp.
This DPRINTF shouldn't be necessary since it shows the operands and results of the instruction which the trace should already make available. Also by passing the destination register to DPRINTF, the ISA parser will assume that it's also a source when tracking dependencies.
Change-Id: I820387c82578bdbb8d2e3d91652a6c0185077f54 Reviewed-on: https://gem5-review.googlesource.com/c/14475 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Gabe Black <gabeblack@google.com> |
12707:7819f067a128 |
23-May-2018 |
Gabe Black <gabeblack@google.com> |
x86: Add op classes to the MediaOps.
The ISA parser had been assuming these microops were all FloatAddOp which is usually not correct.
Change-Id: Ic54881d16f16b50c3d6a8c74b94bff9ae3b1f43e Reviewed-on: https://gem5-review.googlesource.com/10541 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Reviewed-by: Tariq Azmy <tariqslayer01@gmail.com> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> |
12682:dfc3bb0db088 |
13-Apr-2018 |
Gabe Black <gabeblack@google.com> |
x86: Add a ld/st microop flag for marking an access uncacheable.
This percolates down to the memory request object which will have its "UNCACHEABLE" flag set.
Change-Id: Ie73f4249bfcd57f45a473f220d0988856715a9ce Reviewed-on: https://gem5-review.googlesource.com/9881 Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> |
12588:c007da6c777a |
29-Jan-2018 |
Gabe Black <gabeblack@google.com> |
x86: Add bitfields which can gather/scatter bases and limits.
Add bitfields which can gather/scatter base and limit fields within "normal" segment descriptors, and in TSS descriptors which have the same bitfields in the same positions for those two values.
This centralizes the code which manages those bitfields and makes it less likely that a local implementation will be buggy.
Change-Id: I9809aa626fc31388595c3d3b225c25a0ec6a1275 Reviewed-on: https://gem5-review.googlesource.com/7661 Reviewed-by: Gabe Black <gabeblack@google.com> Maintainer: Gabe Black <gabeblack@google.com> |
12463:84f365522633 |
15-Jan-2018 |
Swapnil Haria <swapnilster@gmail.com> |
arch-x86: Adding clflush, clflushopt, clwb instructions
This patch adds support for cache flushing instructions in x86. It piggybacks on support for similar instructions in arm ISA added by Nikos Nikoleris. I have tested each instruction using microbenchmarks.
Change-Id: I72b6b8dc30c236a21eff7958fa231f0663532d7d Reviewed-on: https://gem5-review.googlesource.com/7401 Reviewed-by: Gabe Black <gabeblack@google.com> Maintainer: Gabe Black <gabeblack@google.com> |
12407:c24d0c2d816d |
20-Dec-2017 |
Gabe Black <gabeblack@google.com> |
riscv,x86: Stop using the arch Nop machine instruction unnecessarily.
That particular ExtMachInst is a convenient placeholder, but a value of 0 in RISCV or a static uninitialized ExtMachInst (which will therefore be all zeroes) on x86 works just as well, and removes the need for an ISA specific constant.
Also, the idea of a universal Nop doesn't always make sense since it could be that what, exactly, doesn't do anything depends on context which would be lost on a constant value of an ExtMachInst. For instance, the value of an ExtMachInst that makes sense might depend on what mode the CPU was in, etc.
Change-Id: I1f1a43a5c607a667e11b79bcf6e059e4f7141b3f Reviewed-on: https://gem5-review.googlesource.com/6825 Reviewed-by: Gabe Black <gabeblack@google.com> Reviewed-by: Alec Roelke <ar4jc@virginia.edu> Maintainer: Gabe Black <gabeblack@google.com> |
12392:e0dbdf30a2a5 |
13-Dec-2017 |
Jason Lowe-Power <jason@lowepower.com> |
misc: Updates for gcc7.2 for x86
GCC 7.2 is much stricter than previous GCC versions. The following changes are needed:
* There is now a warning if there is an implicit fallthrough between two case statments. C++17 adds the [[fallthrough]]; declaration. However, to support non C++17 standards (i.e., C++11), we use M5_FALLTHROUGH. M5_FALLTHROUGH checks for [[fallthrough]] compliant C++17 compiler and if that doesn't exist, it defaults to nothing (no older compilers generate warnings). * The above resulted in a couple of bugs that were found. This is noted in the review request on gerrit. * throw() for dynamic exception specification is deprecated * There were a couple of new uninitialized variable warnings * Can no longer perform bitwise operations on a bool. * Must now include <functional> for std::function * Compiler bug for void* lambda. Changed to auto as work around. See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82878
Change-Id: I5d4c782a4e133fa4cdb119e35d9aff68c6e2958e Signed-off-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-on: https://gem5-review.googlesource.com/5802 Reviewed-by: Gabe Black <gabeblack@google.com> |
12384:481add71d2e4 |
12-Dec-2017 |
Gabe Black <gabeblack@google.com> |
x86: Rework how "split" loads/stores are handled.
Explicitly separate the way the data is represented in the underlying representation from how it's represented in the instruction.
In order to make the ISA parser happy, the Mem operand needs to have a single, particular type. To handle that with scalar types, we just used uint64_ts and then worked with values that were smaller than the maximum we could hold. To work with these new array values, we also use an underlying uint64_t for each element.
To make accessing the underlying memory system more natural, when we go to actually read or write values, we translate the access into an array of the actual, correct underlying type. That way we don't have non-exact asserts which confuse gcc, or weird endianness conversion which assumes that the data should be flipped 8 bytes at a time.
Because the functions involved are generally inline, the syntactic niceness should all boil off, and the final implementation in the binary should be simple and efficient for the given data types.
Change-Id: I14ce7a2fe0dc2cbaf6ad4a0d19f743c45ee78e26 Reviewed-on: https://gem5-review.googlesource.com/6582 Reviewed-by: Gabe Black <gabeblack@google.com> Maintainer: Gabe Black <gabeblack@google.com> |
12236:126ac9da6050 |
04-Nov-2017 |
Gabe Black <gabeblack@google.com> |
alpha,arm,mips,power,riscv,sparc,x86: Merge exec decl templates.
In the ISA instruction definitions, some classes were declared with execute, etc., functions outside of the main template because they had CPU specific signatures and would need to be duplicated with each CPU plugged into them. Now that the instructions always just use an ExecContext, there's no reason for those templates to be separate. This change folds those templates together.
Change-Id: I13bda247d3d1cc07c0ea06968e48aa5b4aace7fa Reviewed-on: https://gem5-review.googlesource.com/5401 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Alec Roelke <ar4jc@virginia.edu> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> |
12234:78ece221f9f5 |
02-Nov-2017 |
Gabe Black <gabeblack@google.com> |
alpha,arm,mips,power,riscv,sparc,x86,isa: De-specialize ExecContexts.
The ISA parser used to generate different copies of exec functions for each exec context class a particular CPU wanted to use. That's since been changed so that those functions take a pointer to the base ExecContext, so the code which would generate those extra functions can be removed, and some functions which used to be templated on an ExecContext subclass can be untemplated, or minimally less templated.
Now that some functions aren't going to be instantiated multiple times with different signatures, there are also opportunities to collapse templates and make many instruction definitions simpler within the parser. Since those changes will be less mechanical, they're left for later changes and will probably be done in smaller increments.
Change-Id: I0015307bb02dfb9c60380b56d2a820f12169ebea Reviewed-on: https://gem5-review.googlesource.com/5381 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> |
12106:7784fac1b159 |
05-Apr-2017 |
Rekai Gonzalez-Alberquilla <Rekai.GonzalezAlberquilla@arm.com> |
cpu: Simplify the rename interface and use RegId
With the hierarchical RegId there are a lot of functions that are redundant now.
The idea behind the simplification is that instead of having the regId, telling which kind of register read/write/rename/lookup/etc. and then the function panic_if'ing if the regId is not of the appropriate type, we provide an interface that decides what kind of register to read depending on the register type of the given regId.
Change-Id: I7d52e9e21fc01205ae365d86921a4ceb67a57178 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> [ Fix RISCV build issues ] Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/2702 |
12104:edd63f9c6184 |
05-Apr-2017 |
Nathanael Premillieu <nathanael.premillieu@arm.com> |
arch, cpu: Architectural Register structural indexing
Replace the unified register mapping with a structure associating a class and an index. It is now much easier to know which class of register the index is referring to. Also, when adding a new class there is no need to modify existing ones.
Change-Id: I55b3ac80763702aa2cd3ed2cbff0a75ef7620373 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> [ Fix RISCV build issues ] Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/2700 |
12025:fbfb3dd3f324 |
15-May-2017 |
Gabe Black <gabeblack@google.com> |
x86: Fix the multiplication microops.
If the operands were 64 bit, an intermediate calculation could lose a carry bit. This change rearranges that intermediate calculation if the operand width is large, and reworks the microop implementation in general in an attempt to make it easier to understand.
Change-Id: Ib36333f3f2695a33cd9623e43682de22ebd2e7ea Reviewed-on: https://gem5-review.googlesource.com/3381 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> |
11829:cb5390385d87 |
10-Feb-2017 |
Jason Lowe-Power <jason@lowepower.com> |
x86: Fix implicit stack addressing in 64-bit mode
When in 64-bit mode, if the stack is accessed implicitly by an instruction the alternate address prefix should be ignored if present.
This patch adds an extra flag to the ldstop which signifies when the address override should be ignored. Then, for all of the affected instructions, this patch adds two options to the ld and st opcode to use the current stack addressing mode for all addresses and to ignore the AddressSizeFlagBit. Finally, this patch updates the x86 TLB to not truncate the address if it is in 64-bit mode and the IgnoreAddrSizeFlagBit is set.
This fixes a problem when calling __libc_start_main with a binary that is linked with a recent version of ld. This version of ld uses the address override prefix (0x67) on the call instruction instead of a nop.
Note: This has not been tested in compatibility mode and only the call instruction with the address override prefix has been tested.
See [1] page 9 (pdf page 45)
For instructions that are affected see [1] page 519 (pdf page 555).
[1] http://support.amd.com/TechDocs/24594.pdf
Signed-off-by: Jason Lowe-Power <jason@lowepower.com> |
11711:8565c34ec32e |
21-Nov-2016 |
Tony Gutierrez <anthony.gutierrez@amd.com> |
x86: fix issue with casting in Cvtf2i
UBSAN flags this operation because it detects that arg is being cast directly to an unsigned type, argBits. this patch fixes this by first casting the value to a signed int type, then reintrepreting the raw bits of the signed int into argBits. |
11329:82bb3ee706b3 |
06-Feb-2016 |
Alexandru Dutu <alexandru.dutu@amd.com> |
x86: revamp cmpxchg8b/cmpxchg16b implementation
The previous implementation did a pair of nested RMW operations, which isn't compatible with the way that locked RMW operations are implemented in the cache models. It was convenient though in that it didn't require any new micro-ops, and supported cmpxchg16b using 64-bit memory ops. It also worked in AtomicSimpleCPU where atomicity was guaranteed by the core and not by the memory system. It did not work with timing CPU models though.
This new implementation defines new 'split' load and store micro-ops which allow a single memory operation to use a pair of registers as the source or destination, then uses a single ldsplit/stsplit RMW pair to implement cmpxchg. This patch requires support for 128-bit memory accesses in the ISA (added via a separate patch) to support cmpxchg16b. |
11328:9512d2e25f14 |
06-Feb-2016 |
Steve Reinhardt <steve.reinhardt@amd.com> |
arch, x86: add support for arrays as memory operands
Although the cache models support wider accesses, the ISA descriptions assume that (for the most part) memory operands are integer types, which makes it difficult to define instructions that do memory accesses larger than 64 bits.
This patch adds some generic support for memory operands that are arrays of uint64_t, and specifically a 'u2qw' operand type for x86 that is an array of 2 uint64_ts (128 bits). This support is unused at this point, but will be needed shortly for cmpxchg16b. Ideally the 128-bit SSE memory accesses will also be rewritten to use this support.
Support for 128-bit accesses could also have been added using the gcc __int128_t extension, which would have been less disruptive. However, although clang also supports __int128_t, it's still non-standard. Also, more importantly, this approach creates a path to defining 256- and 512-byte operands as well, which will be useful for eventual AVX support. |
11320:42ecb523c64a |
06-Feb-2016 |
Steve Reinhardt <steve.reinhardt@amd.com> |
style: remove trailing whitespace
Result of running 'hg m5style --skip-all --fix-white -a'. |
11303:f694764d656d |
17-Jan-2016 |
Steve Reinhardt <steve.reinhardt@amd.com> |
cpu. arch: add initiateMemRead() to ExecContext interface
For historical reasons, the ExecContext interface had a single function, readMem(), that did two different things depending on whether the ExecContext supported atomic memory mode (i.e., AtomicSimpleCPU) or timing memory mode (all the other models). In the former case, it actually performed a memory read; in the latter case, it merely initiated a read access, and the read completion did not happen until later when a response packet arrived from the memory system.
This led to some confusing things, including timing accesses being required to provide a pointer for the return data even though that pointer was only used in atomic mode.
This patch splits this interface, adding a new initiateMemRead() function to the ExecContext interface to replace the timing-mode use of readMem().
For consistency and clarity, the readMemTiming() helper function in the ISA definitions is renamed to initiateMemRead() as well. For x86, where the access size is passed in explicitly, we can also get rid of the data parameter at this level. For other ISAs, where the access size is determined from the type of the data parameter, we have to keep the parameter for that purpose. |
11160:10f28b61fcb1 |
06-Oct-2015 |
Steve Reinhardt <steve.reinhardt@amd.com> |
x86: implement rcpps and rcpss SSE insts
These are packed single-precision approximate reciprocal operations, vector and scalar versions, respectively.
This code was basically developed by copying the code for sqrtps and sqrtss. The mrcp micro-op was simplified relative to msqrt since there are no double-precision versions of this operation. |
11159:9459593cb649 |
06-Oct-2015 |
Steve Reinhardt <steve.reinhardt@amd.com> |
x86: implement fild, fucomi, and fucomip x87 insts
fild loads an integer value into the x87 top of stack register. fucomi/fucomip compare two x87 register values (the latter also doing a stack pop). These instructions are used by some versions of GNU libstdc++. |
10805:f2c472d4ff9c |
29-Apr-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
x86: change divide-by-zero fault to divide-error Same exception is raised whether division with zero is performed or the quotient is greater than the maximum value that the provided space can hold. Divide-by-Zero is the AMD terminology, while Divide-Error is Intel's. |
10760:8f5993cfa916 |
23-Mar-2015 |
Steve Reinhardt <steve.reinhardt@amd.com> |
mem: rename Locked/LOCKED to LockedRMW/LOCKED_RMW
Makes x86-style locked operations even more distinct from LLSC operations. Using "locked" by itself should be obviously ambiguous now. |
10474:799c8ee4ecba |
16-Oct-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
arch: Use shared_ptr for all Faults
This patch takes quite a large step in transitioning from the ad-hoc RefCountingPtr to the c++11 shared_ptr by adopting its use for all Faults. There are no changes in behaviour, and the code modifications are mostly just replacing "new" with "make_shared". |
10341:0b4d10f53c2d |
03-Sep-2014 |
Mitch Hayenga <mitch.hayenga@arm.com> |
x86: Flag instructions that call suspend as IsQuiesce
The o3 cpu relies upon instructions that suspend a thread context being flagged as "IsQuiesce". If they are not, unpredictable behavior can occur. This patch fixes that for the x86 ISA. |
10313:01dda09b93e5 |
01-Sep-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
x86: set op class of two fp instructions This patch sets op class of two fp instructions: movfp and pop x87 stack as IntAluOp since these instructions do not make use of the fp alu. |
10196:be0e1724eb39 |
09-May-2014 |
Curtis Dunham <Curtis.Dunham@arm.com> |
arch: teach ISA parser how to split code across files
This patch encompasses several interrelated and interdependent changes to the ISA generation step. The end goal is to reduce the size of the generated compilation units for instruction execution and decoding so that batch compilation can proceed with all CPUs active without exhausting physical memory.
The ISA parser (src/arch/isa_parser.py) has been improved so that it can accept 'split [output_type];' directives at the top level of the grammar and 'split(output_type)' python calls within 'exec {{ ... }}' blocks. This has the effect of "splitting" the files into smaller compilation units. I use air-quotes around "splitting" because the files themselves are not split, but preprocessing directives are inserted to have the same effect.
Architecturally, the ISA parser has had some changes in how it works. In general, it emits code sooner. It doesn't generate per-CPU files, and instead defers to the C preprocessor to create the duplicate copies for each CPU type. Likewise there are more files emitted and the C preprocessor does more substitution that used to be done by the ISA parser.
Finally, the build system (SCons) needs to be able to cope with a dynamic list of source files coming out of the ISA parser. The changes to the SCons{cript,truct} files support this. In broad strokes, the targets requested on the command line are hidden from SCons until all the build dependencies are determined, otherwise it would try, realize it can't reach the goal, and terminate in failure. Since build steps (i.e. running the ISA parser) must be taken to determine the file list, several new build stages have been inserted at the very start of the build. First, the build dependencies from the ISA parser will be emitted to arch/$ISA/generated/inc.d, which is then read by a new SCons builder to finalize the dependencies. (Once inc.d exists, the ISA parser will not need to be run to complete this step.) Once the dependencies are known, the 'Environments' are made by the makeEnv() function. This function used to be called before the build began but now happens during the build. It is easy to see that this step is quite slow; this is a known issue and it's important to realize that it was already slow, but there was no obvious cause to attribute it to since nothing was displayed to the terminal. Since new steps that used to be performed serially are now in a potentially-parallel build phase, the pathname handling in the SCons scripts has been tightened up to deal with chdir() race conditions. In general, pathnames are computed earlier and more likely to be stored, passed around, and processed as absolute paths rather than relative paths. In the end, some of these issues had to be fixed by inserting serializing dependencies in the build.
Minor note: For the null ISA, we just provide a dummy inc.d so SCons is never compelled to try to generate it. While it seems slightly wrong to have anything in src/arch/*/generated (i.e. a non-generated 'generated' file), it's by far the simplest solution. |
10184:bbfa3152bdea |
09-May-2014 |
Curtis Dunham <Curtis.Dunham@arm.com> |
arch: remove inline specifiers on all inst constrs, all ISAs
With (upcoming) separate compilation, they are useless. Only link-time optimization could re-inline them, but ideally feedback-directed optimization would choose to do so only for profitable (i.e. common) instructions. |
10042:d4405a6bcc5a |
27-Jan-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
x86: correct error in emms instruction. |
9896:e31776cf4743 |
29-Sep-2013 |
Andreas Sandberg <andreas@sandberg.pp.se> |
x86: Add support for FXSAVE, FXSAVE64, FXRSTOR, and FXRSTOR64 |
9894:c0a3920859bd |
29-Sep-2013 |
Andreas Sandberg <andreas@sandberg.pp.se> |
x86: Add support for loading 32-bit and 80-bit floats in the x87
The x87 FPU supports three floating point formats: 32-bit, 64-bit, and 80-bit floats. The current gem5 implementation supports 32-bit and 64-bit floats, but only works correctly for 64-bit floats. This changeset fixes the 32-bit float handling by correctly loading and rounding (using truncation) 32-bit floats instead of simply truncating the bit pattern.
80-bit floats are loaded by first loading the 80-bits of the float to two temporary integer registers. A micro-op (cvtint_fp80) then converts the contents of the two integer registers to the internal FP representation (double). Similarly, when storing an 80-bit float, there are two conversion routines (ctvfp80h_int and cvtfp80l_int) that convert an internal FP register to 80-bit and stores the upper 64-bits or lower 32-bits to an integer register, which is the written to memory using normal integer stores. |
9893:5924b77fb8fc |
30-Sep-2013 |
Andreas Sandberg <andreas@sandberg.pp.se> |
x86: Fix re-entrancy problems in x87 store instructions
X87 store instructions typically loads and pops the top value of the stack and stores it in memory. The current implementation pops the stack at the same time as the floating point value is loaded to a temporary register. This will corrupt the state of the x87 stack if the store fails. This changeset introduces a pop87 micro-instruction that pops the stack and uses this instruction in the affected macro-instructions to pop the stack after storing the value to memory. |
9765:da0e0df0ba97 |
18-Jun-2013 |
Andreas Sandberg <andreas@sandberg.pp.se> |
x86: Add support for maintaining the x87 tag word
The current implementation of the x87 never updates the x87 tag word. This is currently not a big issue since the simulated x87 never checks for stack overflows, however this becomes an issue when switching between a virtualized CPU and a simulated CPU. This changeset adds support, which is enabled by default, for updating the tag register to every floating point microop that updates the stack top using the spm mechanism.
The new tag words is generated by the helper function X86ISA::genX87Tags(). This function is currently limited to flagging a stack position as valid or invalid and does not try to distinguish between the valid, zero, and special states. |
9764:7e744dcb1904 |
18-Jun-2013 |
Andreas Sandberg <andreas@sandberg.pp.se> |
x86: Fix loading of floating point constants
This changeset actually fixes two issues:
* The lfpimm instruction didn't work correctly when applied to a floating point constant (it did work for integers containing the bit string representation of a constant) since it used reinterpret_cast to convert a double to a uint64_t. This caused a compilation error, at least, in gcc 4.6.3.
* The instructions loading floating point constants in the x87 processor didn't work correctly since they just stored a truncated integer instead of a double in the floating point register. This changeset fixes the old microcode by using lfpimm instruction instead of the limm instructions. |
9761:f2102d45a753 |
18-Jun-2013 |
Andreas Sandberg <andreas@sandberg.pp.se> |
x86: Make fprem like the fprem on a real x87
The current implementation of fprem simply does an fmod and doesn't simulate any of the iterative behavior in a real fprem. This isn't normally a problem, however, it can lead to problems when switching between CPU models. If switching from a real CPU in the middle of an fprem loop to a simulated CPU, the output of the fprem loop becomes correupted. This changeset changes the fprem implementation to work like the one on real hardware. |
9758:353587055aff |
18-Jun-2013 |
Andreas Sandberg <andreas@sandberg.pp.se> |
x86: Fix the flag handling code in FABS and FCHS
This changeset fixes two problems in the FABS and FCHS implementation. First, the ISA parser expects the assignment in flag_code to be a pure assignment and not an and-assignment, which leads to the isa_parser omitting the misc reg update. Second, the FCHS and FABS macro-ops don't set the SetStatus flag, which means that the default micro-op version, which doesn't update FSW, is executed. |
9699:76828cbe5de4 |
21-May-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
x86: add op class for int and fp microops in isa description Currently all the integer microops are marked as IntAluOp and the floating point microops are marked as FloatAddOp. This patch adds support for marking different microops differently. Now IntMultOp, IntDivOp, FloatDivOp, FloatMultOp, FloatCvtOp, FloatSqrtOp classes will be used as well. This will help in providing different latencies for different op class. |
9582:0632d2d1575c |
11-Mar-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
x86: implement some of the x87 instructions This patch implements ftan, fprem, fyl2x, fld* floating-point instructions. |
9471:4193ed60eed7 |
15-Jan-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
x86: implements emms instruction |
9470:68f7e0bcf4aa |
15-Jan-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
x86: implement fabs, fchs instructions |
9371:7c1484cc9b10 |
30-Dec-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
x86: implement x87 fp instruction fsincos This patch implements the fsincos instruction. The code was originally written by Vince Weaver. Gabe had made some comments about the code, but those were never addressed. This patch addresses those comments. |
9212:dc386ccc1db9 |
11-Sep-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
X86: make use of register predication The patch introduces two predicates for condition code registers -- one tests if a register needs to be read, the other tests whether a register needs to be written to. These predicates are evaluated twice -- during construction of the microop and during its execution. Register reads and writes are elided depending on how the predicates evaluate. |
9211:46c3a74952ec |
11-Sep-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
x86: Add a separate register for D flag bit The D flag bit is part of the cc flag bit register currently. But since it is not being used any where in the implementation, it creates an unnecessary dependency. Hence, it is being moved to a separate register. |
9010:7891b96e1526 |
22-May-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
X86: Split Condition Code register This patch moves the ECF and EZF bits to individual registers (ecfBit and ezfBit) and the CF and OF bits to cfofFlag registers. This is being done so as to lower the read after write dependencies on the the condition code register. Ultimately we will have the following registers [ZAPS], [OF], [CF], [ECF], [EZF] and [DF]. Note that this is only one part of the solution for lowering the dependencies. The other part will check whether or not the condition code register needs to be actually read. This would be done through a separate patch. |
8946:fb6c89334b86 |
14-Apr-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
clang/gcc: Fix compilation issues with clang 3.0 and gcc 4.6
This patch addresses a number of minor issues that cause problems when compiling with clang >= 3.0 and gcc >= 4.6. Most importantly, it avoids using the deprecated ext/hash_map and instead uses unordered_map (and similarly so for the hash_set). To make use of the new STL containers, g++ and clang has to be invoked with "-std=c++0x", and this is now added for all gcc versions >= 4.6, and for clang >= 3.0. For gcc >= 4.3 and <= 4.5 and clang <= 3.0 we use the tr1 unordered_map to avoid the deprecation warning.
The addition of c++0x in turn causes a few problems, as the compiler is more stringent and adds a number of new warnings. Below, the most important issues are enumerated:
1) the use of namespaces is more strict, e.g. for isnan, and all headers opening the entire namespace std are now fixed.
2) another other issue caused by the more stringent compiler is the narrowing of the embedded python, which used to be a char array, and is now unsigned char since there were values larger than 128.
3) a particularly odd issue that arose with the new c++0x behaviour is found in range.hh, where the operator< causes gcc to complain about the template type parsing (the "<" is interpreted as the beginning of a template argument), and the problem seems to be related to the begin/end members introduced for the range-type iteration, which is a new feature in c++11.
As a minor update, this patch also fixes the build flags for the clang debug target that used to be shared with gcc and incorrectly use "-ggdb". |
8925:97f06a79b6f5 |
31-Mar-2012 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Fix address size handling so real mode works properly.
Virtual (pre-segmentation) addresses are truncated based on address size, and any non-64 bit linear address is truncated to 32 bits. This means that real mode addresses aren't truncated down to 16 bits after their segment bases are added in. |
8857:120adc5a4345 |
26-Feb-2012 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Use the M5PanicFault fault in execute methods instead of calling panic.
If an instruction is executed speculatively and hits a situation where it wants to panic, it should return a fault instead. If the instruction was misspeculated, the fault can be thrown away. If the instruction wasn't misspeculated, the fault will be invoked and the panic will still happen. |
8626:19eed0015983 |
01-Dec-2011 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Fix a bad segmentation check for the stack segment. |
8610:9bdd52a2214c |
03-Nov-2011 |
Nilay Vaish<nilay@cs.wisc.edu> |
x86: Add microop for fence This patch adds a new microop for memory barrier. The microop itself does nothing, but since it is marked as a memory barrier, the O3 CPU should flush all the pending loads and stores before the fence to the memory system. |
8607:5fb918115c07 |
31-Oct-2011 |
Gabe Black <gblack@eecs.umich.edu> |
GCC: Get everything working with gcc 4.6.1.
And by "everything" I mean all the quick regressions. |
8588:ef28ed90449d |
27-Sep-2011 |
Gabe Black <gblack@eecs.umich.edu> |
ISA parser: Use '_' instead of '.' to delimit type modifiers on operands.
By using an underscore, the "." is still available and can unambiguously be used to refer to members of a structure if an operand is a structure, class, etc. This change mostly just replaces the appropriate "."s with "_"s, but there were also a few places where the ISA descriptions where handling the extensions themselves and had their own regular expressions to update. The regular expressions in the isa parser were updated as well. It also now looks for one of the defined type extensions specifically after connecting "_" where before it would look for any sequence of characters after a "." following an operand name and try to use it as the extension. This helps to disambiguate cases where a "_" may legitimately be part of an operand name but not separate the name from the type suffix.
Because leaving the "_" and suffix on the variable name still leaves a valid C++ identifier and all extensions need to be consistent in a given context, I considered leaving them on as a breadcrumb that would show what the intended type was for that operand. Unfortunately the operands can be referred to in code templates, the Mem operand in particular, and since the exact type of Mem can be different for different uses of the same template, that broke things. |
8442:b1f3dfae06f1 |
03-Jul-2011 |
Gabe Black <gblack@eecs.umich.edu> |
ISA: Use readBytes/writeBytes for all instruction level memory operations. |
8440:e513600a3551 |
03-Jul-2011 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Fix store microops so they don't drop faults in timing mode.
If a fault was returned by the CPU when a store initiated it's write, the store instruction would ignore the fault. This change fixes that. |
8432:4a0c9c9409e4 |
21-Jun-2011 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Eliminate an unused argument for building store microops. |
8103:53c2d9b1c15d |
02-Mar-2011 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Mark IO reads and writes as non-speculative. |
8102:77ee9ad2e113 |
02-Mar-2011 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Mark prefetches as such in their instruction and request flags. |
7975:4ddb6f13cf13 |
15-Feb-2011 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Get rid of "inline" on the MicroPanic constructor in decoder.cc.
This was making certain versions of gcc omit the function from the object file which would break the build. |
7969:068f061e57a8 |
13-Feb-2011 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Put the result used for flags in an intermediate variable.
Using the destination register directly causes the ISA parser to treat it as a source even if none of the original bits are used. |
7967:b243dc8cec8b |
13-Feb-2011 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Don't read in dest regs if all bits are replaced.
In x86, 32 and 64 bit writes to registers in which registers appear to be 32 or 64 bits wide overwrite all bits of the destination register. This change removes false dependencies in these cases where the previous value of a register doesn't need to be read to write a new value. New versions of most microops are created that have a "Big" suffix which simply overwrite their destination, and the right version to use is selected during microop allocation based on the selected data size.
This does not change the performance of the O3 CPU model significantly, I assume because there are other false dependencies from the condition code bits in the flags register. |
7965:f4c89fe1246b |
13-Feb-2011 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Define fault objects to carry debug messages.
These faults can panic/warn/warn_once, etc., instead of instructions doing that themselves directly. That way, instructions can be speculatively executed, and only if they're actually going to commit will their fault be invoked and the panic, etc., happen. |
7894:48d31b577847 |
07-Feb-2011 |
Brad Beckmann <Brad.Beckmann@amd.com> |
x86: set IsCondControl flag for the appropriate microops |
7874:c7f15c60898e |
02-Feb-2011 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Get rid of the stupd microop. |
7789:f455790bcd47 |
08-Dec-2010 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Take advantage of new PCState syntax. |
7720:65d338a8dba4 |
31-Oct-2010 |
Gabe Black <gblack@eecs.umich.edu> |
ISA,CPU,etc: Create an ISA defined PC type that abstracts out ISA behaviors.
This change is a low level and pervasive reorganization of how PCs are managed in M5. Back when Alpha was the only ISA, there were only 2 PCs to worry about, the PC and the NPC, and the lsb of the PC signaled whether or not you were in PAL mode. As other ISAs were added, we had to add an NNPC, micro PC and next micropc, x86 and ARM introduced variable length instruction sets, and ARM started to keep track of mode bits in the PC. Each CPU model handled PCs in its own custom way that needed to be updated individually to handle the new dimensions of variability, or, in the case of ARMs mode-bit-in-the-pc hack, the complexity could be hidden in the ISA at the ISA implementation's expense. Areas like the branch predictor hadn't been updated to handle branch delay slots or micropcs, and it turns out that had introduced a significant (10s of percent) performance bug in SPARC and to a lesser extend MIPS. Rather than perpetuate the problem by reworking O3 again to handle the PC features needed by x86, this change was introduced to rework PC handling in a more modular, transparent, and hopefully efficient way.
PC type:
Rather than having the superset of all possible elements of PC state declared in each of the CPU models, each ISA defines its own PCState type which has exactly the elements it needs. A cross product of canned PCState classes are defined in the new "generic" ISA directory for ISAs with/without delay slots and microcode. These are either typedef-ed or subclassed by each ISA. To read or write this structure through a *Context, you use the new pcState() accessor which reads or writes depending on whether it has an argument. If you just want the address of the current or next instruction or the current micro PC, you can get those through read-only accessors on either the PCState type or the *Contexts. These are instAddr(), nextInstAddr(), and microPC(). Note the move away from readPC. That name is ambiguous since it's not clear whether or not it should be the actual address to fetch from, or if it should have extra bits in it like the PAL mode bit. Each class is free to define its own functions to get at whatever values it needs however it needs to to be used in ISA specific code. Eventually Alpha's PAL mode bit could be moved out of the PC and into a separate field like ARM.
These types can be reset to a particular pc (where npc = pc + sizeof(MachInst), nnpc = npc + sizeof(MachInst), upc = 0, nupc = 1 as appropriate), printed, serialized, and compared. There is a branching() function which encapsulates code in the CPU models that checked if an instruction branched or not. Exactly what that means in the context of branch delay slots which can skip an instruction when not taken is ambiguous, and ideally this function and its uses can be eliminated. PCStates also generally know how to advance themselves in various ways depending on if they point at an instruction, a microop, or the last microop of a macroop. More on that later.
Ideally, accessing all the PCs at once when setting them will improve performance of M5 even though more data needs to be moved around. This is because often all the PCs need to be manipulated together, and by getting them all at once you avoid multiple function calls. Also, the PCs of a particular thread will have spatial locality in the cache. Previously they were grouped by element in arrays which spread out accesses.
Advancing the PC:
The PCs were previously managed entirely by the CPU which had to know about PC semantics, try to figure out which dimension to increment the PC in, what to set NPC/NNPC, etc. These decisions are best left to the ISA in conjunction with the PC type itself. Because most of the information about how to increment the PC (mainly what type of instruction it refers to) is contained in the instruction object, a new advancePC virtual function was added to the StaticInst class. Subclasses provide an implementation that moves around the right element of the PC with a minimal amount of decision making. In ISAs like Alpha, the instructions always simply assign NPC to PC without having to worry about micropcs, nnpcs, etc. The added cost of a virtual function call should be outweighed by not having to figure out as much about what to do with the PCs and mucking around with the extra elements.
One drawback of making the StaticInsts advance the PC is that you have to actually have one to advance the PC. This would, superficially, seem to require decoding an instruction before fetch could advance. This is, as far as I can tell, realistic. fetch would advance through memory addresses, not PCs, perhaps predicting new memory addresses using existing ones. More sophisticated decisions about control flow would be made later on, after the instruction was decoded, and handed back to fetch. If branching needs to happen, some amount of decoding needs to happen to see that it's a branch, what the target is, etc. This could get a little more complicated if that gets done by the predecoder, but I'm choosing to ignore that for now.
Variable length instructions:
To handle variable length instructions in x86 and ARM, the predecoder now takes in the current PC by reference to the getExtMachInst function. It can modify the PC however it needs to (by setting NPC to be the PC + instruction length, for instance). This could be improved since the CPU doesn't know if the PC was modified and always has to write it back.
ISA parser:
To support the new API, all PC related operand types were removed from the parser and replaced with a PCState type. There are two warts on this implementation. First, as with all the other operand types, the PCState still has to have a valid operand type even though it doesn't use it. Second, using syntax like PCS.npc(target) doesn't work for two reasons, this looks like the syntax for operand type overriding, and the parser can't figure out if you're reading or writing. Instructions that use the PCS operand (which I've consistently called it) need to first read it into a local variable, manipulate it, and then write it back out.
Return address stack:
The return address stack needed a little extra help because, in the presence of branch delay slots, it has to merge together elements of the return PC and the call PC. To handle that, a buildRetPC utility function was added. There are basically only two versions in all the ISAs, but it didn't seem short enough to put into the generic ISA directory. Also, the branch predictor code in O3 and InOrder were adjusted so that they always store the PC of the actual call instruction in the RAS, not the next PC. If the call instruction is a microop, the next PC refers to the next microop in the same macroop which is probably not desirable. The buildRetPC function advances the PC intelligently to the next macroop (in an ISA specific way) so that that case works.
Change in stats:
There were no change in stats except in MIPS and SPARC in the O3 model. MIPS runs in about 9% fewer ticks. SPARC runs with 30%-50% fewer ticks, which could likely be improved further by setting call/return instruction flags and taking advantage of the RAS.
TODO:
Add != operators to the PCState classes, defined trivially to be !(a==b). Smooth out places where PCs are split apart, passed around, and put back together later. I think this might happen in SPARC's fault code. Add ISA specific constructors that allow setting PC elements without calling a bunch of accessors. Try to eliminate the need for the branching() function. Factor out Alpha's PAL mode pc bit into a separate flag field, and eliminate places where it's blindly masked out or tested in the PC. |
7719:f299139501f7 |
29-Oct-2010 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Fault on divide by zero instead of panicing. |
7682:37c56be05af0 |
14-Sep-2010 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Make the halt microop non-speculative.
Executing this microop makes the CPU halt even if it was misspeculated. |
7626:bdd926760470 |
23-Aug-2010 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Get rid of the flagless microop constructor.
This will reduce clutter in the source and hopefully speed up compilation. |
7620:3d8a23caa1ef |
23-Aug-2010 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Consolidate extra microop flags into one parameter.
This single parameter replaces the collection of bools that set up various flavors of microops. A flag parameter also allows other flags to be set like the serialize before/after flags, etc., without having to change the constructor. |
7571:405f840c4ae1 |
22-Aug-2010 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Get rid of the unused getAllocator on the python base microop class.
This function is always overridden, and doesn't actually have the right signature. |
7480:6a854784be4f |
25-Jun-2010 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Fix div2 flag calculation. |
7087:fb8d5786ff30 |
24-May-2010 |
Nathan Binkert <nate@binkert.org> |
copyright: Change HP copyright on x86 code to be more friendly |
7081:ff2321547ca3 |
12-May-2010 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Make the cvti2f microop sign extend its integer source correctly.
The code was using the wrong bit as the sign bit. Other similar bits of code seem to be correct. |
7080:c52c581277bf |
12-May-2010 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Actual change that fixes div. How did that happen? |
7070:abdcb0389716 |
02-May-2010 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Finally fix a division corner case.
When doing an unsigned 64 bit division with a divisor that has its most significant bit set, the division code would spill a bit off of the end of a uint64_t trying to shift the dividend into position. This change adds code that handles that case specially by purposefully letting it spill and then going ahead assuming there was a 65th one bit. |
6801:353726c415f4 |
19-Dec-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Add a common named flag for signed media operations. |
6800:335f8b406bb9 |
19-Dec-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Create a common flag with a name to indicate high multiplies. |
6799:36131e4dfb6e |
19-Dec-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Create a common flag with a name to indicate scalar media instructions. |
6742:a2a79fe9655d |
11-Nov-2009 |
Vince Weaver <vince@csl.cornell.edu> |
X86: add ULL to 1's being shifted in 64-bit values
Some of the micro-ops weren't casting 1 to ULL before shifting, which can cause problems. On the perl makerand input this caused some values to be negative that shouldn't have been.
The casts are done as ULL(1) instead of 1ULL to match others in the m5 code base. |
6736:530e457c88c7 |
09-Nov-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Make x86 use PREFETCH instead of PF_EXCLUSIVE. |
6732:4b93003bb069 |
10-Nov-2009 |
Vince Weaver <vince@csl.cornell.edu> |
X86: Remove double-cast in Cvtf2i micro-op
This double cast led to rounding errors which caused some benchmarks to get the wrong values, most notably lucas which failed spectacularly due to CVTTSD2SI returning an off-by-one value. equake was also broken. |
6647:5a9fd91b66a3 |
16-Sep-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Sign extend the immediate of wripi like the register version. |
6646:d9c23fff4f13 |
16-Sep-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Make the imm8 member of immediate microops really 8 bits consistently. |
6624:b157ef23d76c |
23-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Preserve the NO_ACCESS flag when giving CDA a specialized interface. |
6622:aff9a522956a |
21-Aug-2009 |
Nathan Binkert <nate@binkert.org> |
X86: fix some simple compile issues static should not be used for constants that are not inside a class definition. |
6605:e16cf917dcec |
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement a microop for converting fp values to ints. |
6603:b3333ef98685 |
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement a microop that compares fp values and writes a mask as a result. |
6601:457527e517cc |
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement a microop that compares fp values and writes to rflags. |
6596:e60eaef99523 |
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement a shuffle media microop. |
6594:a5dbea7ba3f9 |
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement a mask move microop. |
6592:0143f8c4b2c2 |
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement a microop that moves sign bits. |
6589:7b0f907855d5 |
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Extend mov2int and mov2fp so they can support insert and extract instructions. |
6587:1cb6f8b427c0 |
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement a media average microop. |
6585:0eab2a19847a |
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Let the integer multiply microop use every other possible source value. |
6583:04df43def004 |
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement the media shift microops. These don't handle full 128 bit wide shifts. |
6581:e0f289b84a4b |
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement a "sum of absolute differences" microop. |
6579:26d371ccd503 |
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement an integer media subtract microop. |
6577:cfe4a8f16e5f |
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement a media integer multiply microop. |
6574:991d265901cc |
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement an integer media max microop. |
6572:b0cef5e2dfdb |
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Add a media integer min microop. |
6570:d7907eaf7419 |
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement an integer media addition microop with optional saturation. |
6568:a34aae12095c |
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement a media microop that converts between floating point data types. |
6566:c246dc2ec640 |
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement a microop that compares fp values and writes a mask as its result. |
6562:571fd8d89903 |
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement a media microop for converting integer values to floating point. |
6560:323d48647000 |
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement a floating point media divide microop. |
6558:8f37a2946cc3 |
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement a floating point media multiply microop. |
6556:0e597fe2b391 |
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement a media floating point subtract microop. |
6554:22cb3c1ea3fb |
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement a floating point media add microop. |
6552:fa0ea492a075 |
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement a media square root microop. |
6550:9754d16c242c |
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement the floating point media max microop. |
6548:130e3dd23eab |
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement a floating point media min microop. |
6546:c7e724c1570f |
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Create a pack media microop. |
6545:9c68aea7b1e6 |
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Rename sel to ext for media microops. |
6541:f70ee159db59 |
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement a multimedia andn microop. |
6539:df1ebe278239 |
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement a multimedia and microop. |
6537:bebbb828a363 |
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement a media or microop. |
6534:0943f0e54f0f |
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement a media xor microop. |
6524:e207990ddd14 |
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement the lfpimm microop. |
6521:ff5e7e6bcfbd |
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement an unpack microop. |
6516:b5b420d15a20 |
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Set up a media microop framework and create mov2int and mov2fp microops. |
6482:e4b8ec60fd4b |
08-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Make not taken conditional moves leave the destination alone. Adjust CMOVcc. The manuals from both AMD and Intel say that when writing to a 32 bit destination in 64 bit mode, the upper 32 bits of the register are filled with zeros. They also both say that the CMOV instructions leave their destination alone when their condition fails. Unfortunately, it seems that CMOV will zero extend its destination register whether or not it was supposed to actually do a move on both platforms. This seems to be the only case where this happens, but it would be hard to say for sure. |
6479:b9ab1b56391b |
07-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement shift right/left double microops. This is my best guess as far as what these should do. Other existing microops use implicit registers, mul1s and mul1u for instance, so this should be ok. The microop that loads the implicit DoubleBits register would fall into one of the microop slots for moving to/from special registers. |
6464:2529aeaf1a1c |
05-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Make conditional moves zero extend their 32 bit destinations always. |
6463:fe6165923529 |
05-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Fix condition code setting for signed multiplies with negative results. |
6462:209c3818a863 |
05-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Make the check for negative operands for sign multiply more direct. |
6461:418145f4d7a6 |
05-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Make sure immediate values are truncated properly. Register values will be "picked" which will assure they don't have junk beyond the part we're using. Immediate values don't go through a similar process, so we should truncate them explicitly. |
6456:57e6d35dde10 |
05-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Handle rotate left with carry instructions that go all the way around or more. |
6454:755cf9b6185f |
05-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Handle rotate right with carry instructions that go all the way around or more. |
6453:1d4dbb357560 |
05-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Fix the overflow bit for rotate right with carry. |
6452:751b06abbaae |
05-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Fix the computation of the bottom part of rotate right with carry. |
6451:fc096f28bcd2 |
05-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Fix the computation of the upper part of rotate right with carry. |
6449:a7a428f403da |
05-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Handle rotating right all the way around or more. |
6447:eebbe9f1bf10 |
05-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Make shifts/rotations that write to 32 bits of a register zero extend. |
6446:cc8568cfce8f |
05-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Handle left rotations that go all the way around or more. |
6444:8e72cf8196cc |
05-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Fix the sar carry flag. |
6443:fa4e81c993d0 |
05-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Fix sign extension when doing an arithmetic shift right by 0. |
6442:580a6fbc7585 |
05-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Fix the carry flag for shr. |
6441:801f1fc07a58 |
05-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Fix the carry flag for shl. |
6430:4c5671ecceda |
02-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Fix the high result of mul1s, and removed undefined shifts from the mult microops. |
6345:f9ae7c3a036c |
16-Jul-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Take limitted advantage of the compilers type checking for microop operands. |
6222:9ee4a06a960b |
29-May-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Keep track of more descriptor state to accomodate KVM. |
6132:916f10213bea |
23-Apr-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Put the StoreCheck flag with the others, and don't collide with other flags. |
6080:50890791c591 |
19-Apr-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement the stul microop. This microop does a store and unlocks the requested address. The RISC86 microop ISA doesn't seem to have an equivalent to this, so I'm guessing that the store following an ldstl is automatically unlocking. We don't do it this way for performance reasons since the behavior is the same. |
6079:f39c5598a302 |
19-Apr-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement the ldstl microop. This microop does a load, checks that a store would succeed, and locks the requested address. |
6060:3d524dc980a8 |
19-Apr-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement far jmp. |
6058:b62d79c1990b |
19-Apr-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Fix a bug in the chks microop where it ignored that it found a fault. |
6056:4435d13700de |
19-Apr-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: LEA calculates an address before segmentation. |
6047:bc8caab35dd0 |
19-Apr-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Fix the halt microop. |
5969:815827deb469 |
27-Feb-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Take address size into account when computing an effective address. |
5965:71f8d7c12619 |
27-Feb-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Fix segment limit checks. |
5936:c30088a243ad |
25-Feb-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Add segmentation checks for ldt related descriptors and selectors. |
5935:df55109af564 |
25-Feb-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Make the TSS type check actually return a fault if it fails. |
5934:367ac7cae7b5 |
25-Feb-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Make rdcr use merge and the mov to control register instructions use the right operand size. |
5932:afa0866171e1 |
25-Feb-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Make the segment register reading microops use merge. |
5927:5e3367b103da |
25-Feb-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Do a merge for the zero extension microop. |
5926:c182698e1ab3 |
25-Feb-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Add microops for reading/writing debug registers. |
5924:516eda09c743 |
25-Feb-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Check src1 for illegal values since that's the index we actually use. |
5920:5a9c976270d6 |
25-Feb-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement a basic prefetch instruction. |
5919:08f836f37f61 |
25-Feb-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Use the right portion of a register for stores. |
5912:d113f6def227 |
25-Feb-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Add a flag to force memory accesses to happen at CPL 0. |
5905:e342ab8f92fa |
25-Feb-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Add a wrattr microop. |
5901:76fc2c3e10d2 |
25-Feb-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Fix segment limit checking. |
5900:6776001c9b92 |
25-Feb-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Add a check to chks to verify a task state segment descriptor. |
5899:b702f4fdf16c |
25-Feb-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Add a check to chks which raises #GP(selector) if selector is NULL or not in the GDT. |
5892:a0ef4a6349dc |
25-Feb-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Make the stupd microop not update registers in initiateAcc. |
5890:bdef71accd68 |
25-Feb-2009 |
Gabe Black <gblack@eecs.umich.edu> |
CPU: Get rid of translate... functions from various interface classes. |
5861:8c1aa74572e4 |
06-Feb-2009 |
Nathan Binkert <nate@binkert.org> |
Quell g++ 4.3 warning about operator ambiguity |
5857:8cd8e1393990 |
01-Feb-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Make the fault classes handle error codes better. |
5855:d4e54239ed37 |
01-Feb-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Distinguish between hardware and software interrupts/exceptions |
5853:606b9525071d |
01-Feb-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Make the chks microop check for the right int descriptor type. |
5788:6d4161a36ca1 |
07-Jan-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Autogenerate macroop generateDisassemble function. |
5727:8b9aaeac5bab |
10-Nov-2008 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Fix completeAcc get call. |
5692:0d6addcde185 |
13-Oct-2008 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Set the delayed commit flag in x86 microops appropriately. |
5682:6f1cab082ba7 |
13-Oct-2008 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Add wrval/rdval microops for reading significant miscregs. |
5679:0b7855e2b731 |
13-Oct-2008 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Make sure register microops set fault rather than returning one. |
5678:9af6981bb086 |
13-Oct-2008 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement an wrdh microop which loads bases/offsets from 16 byte descriptors. |
5675:7828ee363019 |
12-Oct-2008 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement the chks check of interrupt gate target code segments. |
5674:4a4f20dfbc60 |
12-Oct-2008 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Add a check type for interrupt gates. |
5673:57be483cea36 |
12-Oct-2008 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Fix chks checking the submode for stack segments. |
5672:f332946e12b2 |
12-Oct-2008 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Let segment manipulation microops be conditional. |
5670:1df7cdfc4aa6 |
12-Oct-2008 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Fix the rdbase microop |
5667:78b94954f66a |
12-Oct-2008 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Create a handy way to access labels from the ROM in microcode. |
5666:e7925fa8f0d6 |
12-Oct-2008 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Make X86's microcode ROM actually do something. |
5663:be5cb9485aed |
12-Oct-2008 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Create an eret microop which returns from ROM to combinational decoding. |
5662:4f3371a1c58c |
12-Oct-2008 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Make Br never report itself as the last microop. |
5661:443e6f925027 |
12-Oct-2008 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Create a SeqOp class of microops and make Br one of them. |
5591:b05a5c5452e0 |
09-Oct-2008 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Fix the debugging microops. The debug functions can't handle a string object format. |
5449:89b696c8b754 |
12-Jun-2008 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Make the disassembly for halt conform with the other microops. |
5433:1b0b8e9ba6a9 |
12-Jun-2008 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Change how segment loading is performed. |
5429:52dbcf7f7328 |
12-Jun-2008 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Keep handy values like the operating mode in one register. |
5428:5a27fea50fee |
12-Jun-2008 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Change what the microop chks does. Instead of computing the segment descriptor address, this now checks if a selector value/descriptor are legal for a particular purpose. |
5427:1c389acefeb9 |
12-Jun-2008 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Add a microop to read a segments attribute register. |
5426:0bdcc60ccc45 |
12-Jun-2008 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Add microops and supporting code to manipulate the whole rflags register. |
5425:4226f6c2d03c |
12-Jun-2008 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Add microops which panic, fatal, warn, and warn_once. |
5424:d4f80459ad5d |
12-Jun-2008 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Truncate descriptors to 16 bits. |
5409:0343cd06df4f |
12-Jun-2008 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Add in some support for the tsc register. |
5359:8c6ff200e4c1 |
26-Feb-2008 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement the INVLPG instruction and the TIA microop. |
5296:5caa774215cd |
02-Dec-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement mov from control register. |
5295:5268691561b4 |
02-Dec-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: First crack at far returns. This is grossly approximate. |
5294:7222bdaed33b |
02-Dec-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Reorganize segmentation and implement segment selector movs. |
5293:5ea2a6dc8f17 |
02-Dec-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Make the "fault" microop predicated. |
5291:5d38610cff05 |
02-Dec-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement the lgdt instruction. |
5290:7dc3e8ee0a22 |
02-Dec-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement wrbase and wrlimit for loading pseudo descriptors. |
5246:21f29e99e021 |
13-Nov-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Make microcode use presegmentation RIPs and the rest of m5 use post segmentation RIPS. |
5241:a6602acdd046 |
12-Nov-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement the wrcr microop which writes a control register, and some control register work. |
5239:0920dfb94514 |
12-Nov-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Change the meaning of the sext and zext width operand, and make sext set zext if the sign bit is 0. |
5232:d3801ea2792e |
12-Nov-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Various fixes to indexing segmentation related registers |
5188:974af6059943 |
30-Oct-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Compile fixes for 32 bit/debug/opt. |
5178:8914ea55a0c6 |
22-Oct-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement the cda microop which checks if an address is legal to write to. |
5175:ee904e392de2 |
21-Oct-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement the stupd microop ("store with update", not "stupid") and use it in ENTER. |
5173:07204d59a328 |
19-Oct-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Impelement the HLT instruction and fix the "halt" microop. |
5172:4f0e76579e7c |
19-Oct-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement a "halt" microop. |
5163:f08b480df4c3 |
19-Oct-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Make the "fault" microop predicated. |
5157:9c6c153af4b1 |
19-Oct-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Make wrip sign extend its second operand. |
5149:356e00996637 |
12-Oct-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement MSR reads and writes and the wrsmr and rdmsr instructions. There are no priviledge checks, so these instructions will all work in all modes. |
5138:069bbeae1ef8 |
07-Oct-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Significantly filled out misc regs. |
5122:b0527f379eb5 |
03-Oct-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Fix the movfp microop. |
5118:f1b1cb6d0fbe |
03-Oct-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement the ldst microop and put it in existing microcode where appropriate. |
5116:91881e9404de |
03-Oct-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Get rid of a hack for ruflag which is no longer necessary. |
5115:fa8e5c5ab419 |
03-Oct-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Allow logic instructions to set ECF as well as CF. |
5083:49559a8060e8 |
19-Sep-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Move the fp microops to their own file with their own base classes in C++ and python. |
5076:956a475dddea |
13-Sep-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Make the shift and rotate instructions set the carry flag(s) and overflow flags like they're supposed to. |
5075:4ae876c5037d |
13-Sep-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Total overhaul of the division instructions and microops. |
5065:63321c544086 |
10-Sep-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Move a comment to be next to the code it describes. |
5063:8eb72b1bd3c6 |
06-Sep-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Rework the multiplication microops so that they work like they would in the patent. |
5062:4c98f8cdcc11 |
06-Sep-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Make signed multiplication do something different from unsigned. |
5061:2ac90228c205 |
06-Sep-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Make signed versions of partial register values available to microops. |
5060:28b30e3e428c |
06-Sep-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Correct how the hi portion of a product is computed. |
5059:33478a26f73e |
06-Sep-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Add a square root microop and the SSE sqrt instruction. |
5058:be23162b7370 |
06-Sep-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Add SSE comparison instructions and microops and move some FP microops to be with the other ones. |
5052:791ae1b04d72 |
05-Sep-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement an SSE xor microop and instruction. |
5051:6bdf2a0ae4fb |
05-Sep-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Make the movfp microop use FloatRegBits instead of FloatRegs. This fixes a problem where interpreting arbitrary bits as floating point would change what the value was. These values are legitimate because the fp registers could be used to move around arbitrary data. |
5047:4a3593bec248 |
05-Sep-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement some SSE fp microops and instructions. |
5046:da031ef02439 |
05-Sep-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Add some SSE floating point/integer conversion microops. |
5042:bc2c08abe249 |
05-Sep-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Fix a corner case where mul would overwrite an original register value it still needed. |
5040:126e4510b5bb |
01-Sep-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Major rework of how regop microops are generated. The new implementation uses metaclass, and gives a lot more precise control with a lot less verbosity. The flags/no flags reg/imm variants are all handled by the same python class now which supplies a constructor to the right C++ class based on context. |
5032:17f771e6b2f2 |
29-Aug-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Fix the sra microop to get the sign bit from the right operand. |
5028:b9d42ad1f94e |
29-Aug-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Add an fp move microop. |
5027:e96b8a4f4d96 |
29-Aug-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Add load and store microops that use the fp registers. |
5011:6333ea094184 |
26-Aug-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Make the Ruflag microop work correctly, and make the code a little clearer. |
5007:121fa5d20f59 |
26-Aug-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Fix the sign extension microop so it extends zeros correctly. |
5002:1b540e93ad34 |
26-Aug-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Remove x86 code that attempted to fix misaligned accesses. |
4951:1b51fb0c3983 |
07-Aug-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Overhaul of ruflags to get it to work correctly. |
4950:f5f19784acf1 |
07-Aug-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Make a microcode branch microop. Also some touch up for ruflag. |
4868:99d4946469a1 |
04-Aug-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement microops and instructions that manipulate the flags register. |
4867:2de05bc73640 |
04-Aug-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Make 64 bit unaligned accesses work as well as the other sizes. There is a fundemental flaw in how unaligned accesses are supported, but this is still an improvement. |
4863:b6dacc9a39ff |
04-Aug-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Start implementing segmentation support. Make instructions observe segment prefixes, default segment rules, segment base addresses. Also fix some microcode and add sib and riprel "keywords" to the x86 specialization of the microassembler. |
4834:9480bde3ae6a |
01-Aug-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Fix for compilation bug with new cache code. |
4823:9bd81e315a34 |
30-Jul-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Remove a naming conflict between the register index parameters and the "picked" register values. |
4809:ee82bc15a483 |
30-Jul-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Make instructions use pick, and implement/adjust some multiplication microops and instructions. |
4804:4a707cb7065b |
30-Jul-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Make disassembly use the final register index. Add bits to indicate whether or not register indexes should be "folded". |
4798:85351424da98 |
29-Jul-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Make logic instructions flag setting work. The instructions now ask for the appropriate flags to be set, and the microops do the "right thing" with the CF and OF flags, namely zero them. |
4792:ccab7ba2c6e5 |
29-Jul-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Make limm use merge and allow overriding the data size. |
4767:5e55d650692e |
27-Jul-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Add functions to read and write to an exec context. These functions take care of calling the thread contexts read and write functions with the right sized data type, and handle unaligned accesses. |
4766:a708d14c44bf |
27-Jul-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Fix carry calculation for subtraction based microops. The carry flag should be calculated using the -complement- of the second operand, not it's negation. The carry in which is part of computing the 2's complement may induce a carry, but if you've already caused the carry before you get the carry computing logic involved, it will miss it. |
4756:a7083c283274 |
24-Jul-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Make the shift and rotate microops mask the shift/rotate amount correctly. |
4733:b0785fa2d7b6 |
21-Jul-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Implement rotate with carry microops. |
4732:9fdd1a5ab692 |
21-Jul-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Fixed the distinction between far and near versions of jmp, call and ret. Implemented some shifts, rotates, and pushes. |
4728:d60b98171bef |
20-Jul-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Implement adc and sbb instructions and microops. |
4725:441c280b5936 |
20-Jul-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Define and fill out a lot of different instructions and instruction versions. Added two of the shift microops. |
4720:15cb65a86e5a |
20-Jul-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Make load and store ops use the appropriate sized data access. |
4714:5e9f906ea0a0 |
20-Jul-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Fix carry flag for subtracts, and clean up code slightly. |
4712:79b4c64296ce |
19-Jul-2007 |
Gabe Black <gblack@eecs.umich.edu> |
x86 fixes Make the emulation environment consider the rex prefix. Implement and hook in forms of j, jmp, cmp, syscall, movzx Added a format for an instruction to carry a call to the SE mode syscalls system Made memory instructions which refer to the rip do so directly Made the operand size overridable in the microassembly Made the "ext" field of register operations 16 bits to hold a sparse encoding of flags to set or conditions to predicate on Added an explicit "rax" operand for the syscall format Implemented syscall returns. |
4708:efa060dd6f3c |
18-Jul-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Make instructions that conditionally set registers set them to their old value if they don't actually execute. |
4706:4ede9a05bb42 |
18-Jul-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Make store microops actually store instead of load. |
4701:6086c14956da |
18-Jul-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Make the data size used by regops overridable in the microassembly. |
4696:459853ed322c |
18-Jul-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Add a generateDisassembly function to the MicroFault StaticInst. |
4693:ca44a1014212 |
17-Jul-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Make disassembled x86 register indices reflect their size. This doesn't handle high byte register accesses. It also highlights the fact that address size isn't actually being calculated, and that the size a microop uses needs to be overridable from the microassembly. |
4688:82d7cbf0e66d |
17-Jul-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Add in support for condition code flags. Some microops can set the condition codes, and some of them can be predicated on them. Some of the codes aren't implemented because it was unclear from the AMD patent what they actually did. They are used with string instructions, but they use variables IP, DTF, and SSTF which don't appear to be documented. |
4679:0b39fa8f5eb8 |
14-Jul-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Pull some hard coded base classes out of the isa description. |
4612:a29c0616839d |
21-Jun-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Add in code that lays the ground work for setting flags. |
4601:38c989d15fef |
20-Jun-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Make memory instructions work better, add more macroop implementations, add an lea microop, move EmulEnv into it's own .cc and .hh. |
4595:5162e9a7728c |
19-Jun-2007 |
Gabe Black <gblack@eecs.umich.edu> |
More faithfulness to what instructions should work in what modes, and added the MOVSXD instruction. |
4592:520664dfb26f |
19-Jun-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Make instructions that are illegal in 64 bit mode not do the wrong thing in 64 bit mode. Also add in more versions of PUSH and POP, and a version of near CALL. |
4590:5c3813b700a3 |
19-Jun-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Renovate the "fault" microop implementation. |
4587:2c9a2534a489 |
19-Jun-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Get rid of the immediate and displacement components of the EmulEnv struct and use them directly out of the instruction. The extra copies are conceptually realistic but are just innefficient as implemented. Also don't use the zeroeth microcode register for general storage since it's now the zero register, and implement a load and a store microops. |
4581:23166f771fa4 |
18-Jun-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Add in incomplete pick and merge functions which read and write pieces of registers, and fill out microcode disassembly. |
4576:31f715613103 |
14-Jun-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Fix limm. |
4561:ade4960f0832 |
13-Jun-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Move load/store microops into their own file. They still don't do anything, though. |
4560:d65c11cc31d7 |
13-Jun-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Fix the immediate version of register operations, and get their name to show up correctly. |
4539:6eeeea62b7c4 |
12-Jun-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Make microOp vs microop and macroOp vs macroop capitilization consistent.
src/arch/x86/isa/macroop.isa: Make microOp vs microop and macroOp vs macroop capitilization consistent. Also fill out the emulation environment handling a little more, and use an object to pass around output code. src/arch/x86/isa/microops/base.isa: Make microOp vs microop and macroOp vs macroop capitilization consistent. Also adjust python to C++ bool translation. |
4534:7035ff1aa521 |
08-Jun-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Fix the formatting on a comment. |
4528:f0b19ee67a7b |
08-Jun-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Big changes to use the new microcode assembler. |
4524:f051dcff22b3 |
04-Jun-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Make limm (load immediate) microop |
4519:f8da6b45573f |
04-Jun-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Reworking x86's microcode system. This is a work in progress, and X86 doesn't compile.
src/arch/x86/isa/decoder/one_byte_opcodes.isa: src/arch/x86/isa/macroop.isa: src/arch/x86/isa/main.isa: src/arch/x86/isa/microasm.isa: src/arch/x86/isa/microops/base.isa: src/arch/x86/isa/microops/microops.isa: src/arch/x86/isa/operands.isa: src/arch/x86/isa/microops/regop.isa: src/arch/x86/isa/microops/specop.isa: Reworking x86's microcode system |
4372:14d42d795242 |
10-Apr-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Include the new GenFault microop. |
4371:c5003760793e |
10-Apr-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Reworked x86 a bit |
4344:174e31456abe |
06-Apr-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Consolidated the microcode assembler to help separate it from more x86-centric stuff. |
4343:3f11bcf873b3 |
06-Apr-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Refactored the x86 isa description some more. There should be more seperation between x86 specific parts, and those parts which are implemented in the isa description but could eventually be moved elsewhere. |
4338:24d31b35bcf9 |
04-Apr-2007 |
Gabe Black <gblack@eecs.umich.edu> |
The process of going from an instruction definition to an instruction to be returned by the decoder has been fleshed out more. The following steps describe how an instruction implementation becomes a StaticInst.
1. Microops are created. These are StaticInsts use templates to provide a basic form of polymorphism without having to make the microassembler smarter. 2. An instruction class is created which has a "templated" microcode program as it's docstring. The template parameters are refernced with ^ following by a number. 3. An instruction in the decoder references an instruction template using it's mnemonic. The parameters to it's format end up replacing the placeholders. These parameters describe a source for an operand which could be memory, a register, or an immediate. It it's a register, the register index is used. If it's memory, eventually a load/store will be pre/postpended to the instruction template and it's destination register will be used in place of the ^. If it's an immediate, the immediate is used. Some operand types, specifically those that come from the ModRM byte, need to be decoded further into memory vs. register versions. This is accomplished by making the decode_block text for these instructions another case statement based off ModRM. 4. Once all of the template parameters have been handled, the instruction goes throw the microcode assembler which resolves labels and creates a list of python op objects. If an operand is a register, it uses a % prefix, an immediate uses $, and a label uses @. If the operand is just letters, numbers, and underscores, it can appear immediately after the prefix. If it's not, it can be encolsed in non nested {}s. 5. If there is a single "op" object (which corresponds to a single microop) the decoder is set up to return it directly. If not, a macroop wrapper is created around it.
In the future, I'm considering seperating the operand type specialization from the template substitution step. A problem this introduces is that either the template arguments need to be kept around for the specialization step, or they need to be re-extracted. Re-extraction might be the way to go so that the operand formats can be coded directly into the micro assembler template without having to pass them in as parameters. I don't know if that's actually useful, though.
src/arch/x86/isa/decoder/one_byte_opcodes.isa: src/arch/x86/isa/microasm.isa: src/arch/x86/isa/microops/microops.isa: src/arch/x86/isa/operands.isa: src/arch/x86/isa/microops/base.isa: Implemented polymorphic microops and changed around the microcode assembler syntax. |
4298:a92aab35e34e |
29-Mar-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Add code to generate register and immediate based integer op microop classes. |