#
13613:a19963be12ca |
|
20-Nov-2018 |
Gabe Black <gabeblack@google.com> |
x86: Stop using/defining some ISA specific register types.
These have been replaced with the generic RegVal type.
Change-Id: I75c1134212067dea43aa0903d813633e06f3d6c6
Reviewed-on: https://gem5-review.googlesource.com/c/14476
Reviewed-by: Gabe Black <gabeblack@google.com>
Maintainer: Gabe Black <gabeblack@google.com>
|
#
13611:c8b7847b4171 |
|
19-Nov-2018 |
Gabe Black <gabeblack@google.com> |
arch: cpu: Rename *FloatRegBits* to *FloatReg*.
Now that there's no plain FloatReg, there's no reason to distinguish FloatRegBits with a special suffix since it's the only way to read or write FP registers.
Change-Id: I3a60168c1d4302aed55223ea8e37b421f21efded
Reviewed-on: https://gem5-review.googlesource.com/c/14460
Reviewed-by: Brandon Potter <Brandon.Potter@amd.com>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Gabe Black <gabeblack@google.com>
|
#
12707:7819f067a128 |
|
23-May-2018 |
Gabe Black <gabeblack@google.com> |
x86: Add op classes to the MediaOps.
The ISA parser had been assuming these microops were all FloatAddOp which is usually not correct.
Change-Id: Ic54881d16f16b50c3d6a8c74b94bff9ae3b1f43e
Reviewed-on: https://gem5-review.googlesource.com/10541
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Reviewed-by: Tariq Azmy <tariqslayer01@gmail.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
|
#
12236:126ac9da6050 |
|
04-Nov-2017 |
Gabe Black <gabeblack@google.com> |
alpha,arm,mips,power,riscv,sparc,x86: Merge exec decl templates.
In the ISA instruction definitions, some classes were declared with execute, etc., functions outside of the main template because they had CPU specific signatures and would need to be duplicated with each CPU plugged into them. Now that the instructions always just use an ExecContext, there's no reason for those templates to be separate. This change folds those templates together.
Change-Id: I13bda247d3d1cc07c0ea06968e48aa5b4aace7fa
Reviewed-on: https://gem5-review.googlesource.com/5401
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Alec Roelke <ar4jc@virginia.edu>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
|
#
12234:78ece221f9f5 |
|
02-Nov-2017 |
Gabe Black <gabeblack@google.com> |
alpha,arm,mips,power,riscv,sparc,x86,isa: De-specialize ExecContexts.
The ISA parser used to generate different copies of exec functions for each exec context class a particular CPU wanted to use. That's since been changed so that those functions take a pointer to the base ExecContext, so the code which would generate those extra functions can be removed, and some functions which used to be templated on an ExecContext subclass can be untemplated, or minimally less templated.
Now that some functions aren't going to be instantiated multiple times with different signatures, there are also opportunities to collapse templates and make many instruction definitions simpler within the parser. Since those changes will be less mechanical, they're left for later changes and will probably be done in smaller increments.
Change-Id: I0015307bb02dfb9c60380b56d2a820f12169ebea
Reviewed-on: https://gem5-review.googlesource.com/5381
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
|
#
11711:8565c34ec32e |
|
21-Nov-2016 |
Tony Gutierrez <anthony.gutierrez@amd.com> |
x86: fix issue with casting in Cvtf2i
UBSAN flags this operation because it detects that arg is being cast directly to an unsigned type, argBits. This patch fixes this by first casting the value to a signed int type, then reinterpreting the raw bits of the signed int into argBits.
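The two-step cast described above can be sketched as follows. This is a minimal illustration rather than the gem5 source: the helper name `floatToArgBits`, the 64-bit widths, and the range check are all assumptions made for the example.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helper illustrating the fix; not the actual Cvtf2i code.
inline uint64_t
floatToArgBits(double arg)
{
    // Assumption: the truncated value fits in int64_t. Out-of-range
    // inputs would still be undefined and need separate handling.
    assert(std::fabs(arg) < 9.2e18);

    // Step 1: convert to a signed integer. This is well defined for
    // negative values and truncates toward zero.
    int64_t signedVal = static_cast<int64_t>(arg);

    // Step 2: reinterpret the raw bits of the signed value as unsigned.
    uint64_t argBits;
    std::memcpy(&argBits, &signedVal, sizeof(argBits));
    return argBits;
}
```

Converting a negative floating-point value straight to an unsigned type is undefined behaviour in C++, which is the kind of operation UBSAN reports. Going through the signed type first keeps the conversion defined for negative inputs.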
|
#
11320:42ecb523c64a |
|
06-Feb-2016 |
Steve Reinhardt <steve.reinhardt@amd.com> |
style: remove trailing whitespace
Result of running 'hg m5style --skip-all --fix-white -a'.
|
#
11160:10f28b61fcb1 |
|
06-Oct-2015 |
Steve Reinhardt <steve.reinhardt@amd.com> |
x86: implement rcpps and rcpss SSE insts
These are packed single-precision approximate reciprocal operations, vector and scalar versions, respectively.
This code was basically developed by copying the code for sqrtps and sqrtss. The mrcp micro-op was simplified relative to msqrt since there are no double-precision versions of this operation.
|
#
10196:be0e1724eb39 |
|
09-May-2014 |
Curtis Dunham <Curtis.Dunham@arm.com> |
arch: teach ISA parser how to split code across files
This patch encompasses several interrelated and interdependent changes to the ISA generation step. The end goal is to reduce the size of the generated compilation units for instruction execution and decoding so that batch compilation can proceed with all CPUs active without exhausting physical memory.
The ISA parser (src/arch/isa_parser.py) has been improved so that it can accept 'split [output_type];' directives at the top level of the grammar and 'split(output_type)' python calls within 'exec {{ ... }}' blocks. This has the effect of "splitting" the files into smaller compilation units. I use air-quotes around "splitting" because the files themselves are not split, but preprocessing directives are inserted to have the same effect.
Architecturally, the ISA parser has had some changes in how it works. In general, it emits code sooner. It doesn't generate per-CPU files, and instead defers to the C preprocessor to create the duplicate copies for each CPU type. Likewise there are more files emitted and the C preprocessor does more substitution that used to be done by the ISA parser.
Finally, the build system (SCons) needs to be able to cope with a dynamic list of source files coming out of the ISA parser. The changes to the SCons{cript,truct} files support this. In broad strokes, the targets requested on the command line are hidden from SCons until all the build dependencies are determined, otherwise it would try, realize it can't reach the goal, and terminate in failure. Since build steps (i.e. running the ISA parser) must be taken to determine the file list, several new build stages have been inserted at the very start of the build. First, the build dependencies from the ISA parser will be emitted to arch/$ISA/generated/inc.d, which is then read by a new SCons builder to finalize the dependencies. (Once inc.d exists, the ISA parser will not need to be run to complete this step.) Once the dependencies are known, the 'Environments' are made by the makeEnv() function. This function used to be called before the build began but now happens during the build. It is easy to see that this step is quite slow; this is a known issue and it's important to realize that it was already slow, but there was no obvious cause to attribute it to since nothing was displayed to the terminal. Since new steps that used to be performed serially are now in a potentially-parallel build phase, the pathname handling in the SCons scripts has been tightened up to deal with chdir() race conditions. In general, pathnames are computed earlier and more likely to be stored, passed around, and processed as absolute paths rather than relative paths. In the end, some of these issues had to be fixed by inserting serializing dependencies in the build.
Minor note: For the null ISA, we just provide a dummy inc.d so SCons is never compelled to try to generate it. While it seems slightly wrong to have anything in src/arch/*/generated (i.e. a non-generated 'generated' file), it's by far the simplest solution.
|
#
10184:bbfa3152bdea |
|
09-May-2014 |
Curtis Dunham <Curtis.Dunham@arm.com> |
arch: remove inline specifiers on all inst constrs, all ISAs
With (upcoming) separate compilation, they are useless. Only link-time optimization could re-inline them, but ideally feedback-directed optimization would choose to do so only for profitable (i.e. common) instructions.
|
#
10042:d4405a6bcc5a |
|
27-Jan-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
x86: correct error in emms instruction.
|
#
9471:4193ed60eed7 |
|
15-Jan-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
x86: implements emms instruction
|
#
9010:7891b96e1526 |
|
22-May-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
X86: Split Condition Code register
This patch moves the ECF and EZF bits to individual registers (ecfBit and ezfBit) and the CF and OF bits to cfofFlag registers. This is done to reduce read-after-write dependencies on the condition code register. Ultimately we will have the following registers: [ZAPS], [OF], [CF], [ECF], [EZF] and [DF]. Note that this is only one part of the solution for lowering the dependencies. The other part will check whether or not the condition code register actually needs to be read. This would be done through a separate patch.
|
#
8946:fb6c89334b86 |
|
14-Apr-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
clang/gcc: Fix compilation issues with clang 3.0 and gcc 4.6
This patch addresses a number of minor issues that cause problems when compiling with clang >= 3.0 and gcc >= 4.6. Most importantly, it avoids using the deprecated ext/hash_map and instead uses unordered_map (and similarly so for the hash_set). To make use of the new STL containers, g++ and clang have to be invoked with "-std=c++0x", and this is now added for all gcc versions >= 4.6, and for clang >= 3.0. For gcc >= 4.3 and <= 4.5 and clang <= 3.0 we use the tr1 unordered_map to avoid the deprecation warning.
The addition of c++0x in turn causes a few problems, as the compiler is more stringent and adds a number of new warnings. Below, the most important issues are enumerated:
1) the use of namespaces is more strict, e.g. for isnan, and all headers opening the entire namespace std are now fixed.
2) another issue caused by the more stringent compiler is the narrowing of the embedded python, which used to be a char array, and is now unsigned char since there were values larger than 128.
3) a particularly odd issue that arose with the new c++0x behaviour is found in range.hh, where the operator< causes gcc to complain about the template type parsing (the "<" is interpreted as the beginning of a template argument), and the problem seems to be related to the begin/end members introduced for the range-type iteration, which is a new feature in c++11.
As a minor update, this patch also fixes the build flags for the clang debug target that used to be shared with gcc and incorrectly use "-ggdb".
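As a rough illustration of the first two points (the standard unordered containers and explicit `std::` qualification), consider the sketch below. The function names and the table contents are hypothetical and do not come from the patch.

```cpp
#include <cassert>
#include <cmath>
#include <string>
#include <unordered_map>

// Hypothetical lookup using std::unordered_map in place of the
// deprecated ext/hash_map.
inline int
lookupOrDefault(const std::unordered_map<std::string, int> &table,
                const std::string &key, int fallback)
{
    auto it = table.find(key);
    return it == table.end() ? fallback : it->second;
}

// isnan is qualified explicitly instead of relying on a header that
// opens the whole std namespace.
inline bool
isUsable(double value)
{
    return !std::isnan(value);
}
```

Both functions need a C++11 (or "c++0x") compiler mode, matching the build flag discussed in the commit message.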
|
#
8588:ef28ed90449d |
|
27-Sep-2011 |
Gabe Black <gblack@eecs.umich.edu> |
ISA parser: Use '_' instead of '.' to delimit type modifiers on operands.
By using an underscore, the "." is still available and can unambiguously be used to refer to members of a structure if an operand is a structure, class, etc. This change mostly just replaces the appropriate "."s with "_"s, but there were also a few places where the ISA descriptions were handling the extensions themselves and had their own regular expressions to update. The regular expressions in the isa parser were updated as well. It also now looks for one of the defined type extensions specifically after connecting "_" where before it would look for any sequence of characters after a "." following an operand name and try to use it as the extension. This helps to disambiguate cases where a "_" may legitimately be part of an operand name but not separate the name from the type suffix.
Because leaving the "_" and suffix on the variable name still leaves a valid C++ identifier and all extensions need to be consistent in a given context, I considered leaving them on as a breadcrumb that would show what the intended type was for that operand. Unfortunately the operands can be referred to in code templates, the Mem operand in particular, and since the exact type of Mem can be different for different uses of the same template, that broke things.
|
#
7626:bdd926760470 |
|
23-Aug-2010 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Get rid of the flagless microop constructor.
This will reduce clutter in the source and hopefully speed up compilation.
|
#
7620:3d8a23caa1ef |
|
23-Aug-2010 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Consolidate extra microop flags into one parameter.
This single parameter replaces the collection of bools that set up various flavors of microops. A flag parameter also allows other flags to be set like the serialize before/after flags, etc., without having to change the constructor.
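The idea of replacing several bool arguments with one flag word can be sketched like this. The flag names, bit positions, and the `MicroopSketch` type are hypothetical and chosen only for illustration; they are not the identifiers used in gem5.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag bits; names and values are illustrative only.
enum MicroFlagSketch : uint64_t
{
    FlagMicro          = UINT64_C(1) << 0,
    FlagDelayed        = UINT64_C(1) << 1,
    FlagFirst          = UINT64_C(1) << 2,
    FlagLast           = UINT64_C(1) << 3,
    FlagSerializeAfter = UINT64_C(1) << 4
};

struct MicroopSketch
{
    uint64_t flags;

    // One flag word replaces a list of bool parameters, so adding a new
    // flag does not change the constructor signature.
    explicit MicroopSketch(uint64_t setFlags) : flags(setFlags) {}

    bool isSet(MicroFlagSketch f) const { return (flags & f) != 0; }
};
```

A caller would combine flags with bitwise OR, for example `MicroopSketch op(FlagMicro | FlagLast);`, instead of passing a positional sequence of `true`/`false` values.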
|
#
7081:ff2321547ca3 |
|
12-May-2010 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Make the cvti2f microop sign extend its integer source correctly.
The code was using the wrong bit as the sign bit. Other similar bits of code seem to be correct.
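For reference, sign extension of a narrow integer lane before conversion to floating point can be sketched as below. The helper names and the variable lane width are assumptions; the point is only that the sign bit of a `width`-bit value is bit `width - 1`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: sign extend the low `width` bits of `bits`.
inline int64_t
signExtendSketch(uint64_t bits, unsigned width)
{
    assert(width >= 1 && width <= 64);
    if (width == 64)
        return static_cast<int64_t>(bits);

    const uint64_t mask = (UINT64_C(1) << width) - 1;
    // The sign bit of a width-bit value is bit (width - 1).
    const uint64_t signBit = UINT64_C(1) << (width - 1);
    const uint64_t value = bits & mask;

    // Flipping and then subtracting the sign bit propagates it upward.
    return static_cast<int64_t>((value ^ signBit) - signBit);
}

// Hypothetical conversion of one integer lane to double.
inline double
intLaneToDouble(uint64_t bits, unsigned width)
{
    return static_cast<double>(signExtendSketch(bits, width));
}
```

Testing the wrong bit would leave values such as `0xFFFF` in a 16-bit lane positive instead of converting them to `-1.0`.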
|
#
6801:353726c415f4 |
|
19-Dec-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Add a common named flag for signed media operations.
|
#
6800:335f8b406bb9 |
|
19-Dec-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Create a common flag with a name to indicate high multiplies.
|
#
6799:36131e4dfb6e |
|
19-Dec-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Create a common flag with a name to indicate scalar media instructions.
|
#
6742:a2a79fe9655d |
|
11-Nov-2009 |
Vince Weaver <vince@csl.cornell.edu> |
X86: add ULL to 1's being shifted in 64-bit values
Some of the micro-ops weren't casting 1 to ULL before shifting, which can cause problems. On the perl makerand input this caused some values to be negative that shouldn't have been.
The casts are done as ULL(1) instead of 1ULL to match others in the m5 code base.
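The difference matters because a plain `1` has type `int`. A small sketch, using a local stand-in macro (`ULL_SKETCH` is a hypothetical name for this example; the code base provides its own `ULL()`):

```cpp
#include <cassert>
#include <cstdint>

// Local stand-in for the ULL() macro mentioned above.
#define ULL_SKETCH(N) ((uint64_t)N##ULL)

inline uint64_t
singleBit(unsigned pos)
{
    assert(pos < 64);
    // `1 << pos` would be evaluated at int width: shifting by the width
    // of int or more is undefined, and a result with the int sign bit
    // set can turn negative when widened to 64 bits. Starting from a
    // 64-bit unsigned 1 avoids both problems.
    return ULL_SKETCH(1) << pos;
}
```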
|
#
6732:4b93003bb069 |
|
10-Nov-2009 |
Vince Weaver <vince@csl.cornell.edu> |
X86: Remove double-cast in Cvtf2i micro-op
This double cast led to rounding errors which caused some benchmarks to get the wrong values, most notably lucas which failed spectacularly due to CVTTSD2SI returning an off-by-one value. equake was also broken.
|
#
6622:aff9a522956a |
|
21-Aug-2009 |
Nathan Binkert <nate@binkert.org> |
X86: fix some simple compile issues
static should not be used for constants that are not inside a class definition.
|
#
6605:e16cf917dcec |
|
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement a microop for converting fp values to ints.
|
#
6603:b3333ef98685 |
|
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement a microop that compares fp values and writes a mask as a result.
|
#
6601:457527e517cc |
|
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement a microop that compares fp values and writes to rflags.
|
#
6596:e60eaef99523 |
|
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement a shuffle media microop.
|
#
6594:a5dbea7ba3f9 |
|
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement a mask move microop.
|
#
6592:0143f8c4b2c2 |
|
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement a microop that moves sign bits.
|
#
6589:7b0f907855d5 |
|
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Extend mov2int and mov2fp so they can support insert and extract instructions.
|
#
6587:1cb6f8b427c0 |
|
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement a media average microop.
|
#
6585:0eab2a19847a |
|
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Let the integer multiply microop use every other possible source value.
|
#
6583:04df43def004 |
|
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement the media shift microops.
These don't handle full 128 bit wide shifts.
|
#
6581:e0f289b84a4b |
|
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement a "sum of absolute differences" microop.
|
#
6579:26d371ccd503 |
|
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement an integer media subtract microop.
|
#
6577:cfe4a8f16e5f |
|
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement a media integer multiply microop.
|
#
6574:991d265901cc |
|
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement an integer media max microop.
|
#
6572:b0cef5e2dfdb |
|
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Add a media integer min microop.
|
#
6570:d7907eaf7419 |
|
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement an integer media addition microop with optional saturation.
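As a generic illustration of what saturating lane-wise addition means, here is a sketch for signed 8-bit lanes packed into a 64-bit word. The function names and the fixed lane width are assumptions for the example and say nothing about how the microop itself is written.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: add two signed bytes, clamping to [-128, 127].
inline int8_t
satAddS8(int8_t a, int8_t b)
{
    int sum = int(a) + int(b);
    if (sum > INT8_MAX)
        return INT8_MAX;
    if (sum < INT8_MIN)
        return INT8_MIN;
    return static_cast<int8_t>(sum);
}

// Hypothetical helper: apply satAddS8 to each of the eight byte lanes.
inline uint64_t
packedSatAddS8(uint64_t x, uint64_t y)
{
    uint64_t result = 0;
    for (unsigned i = 0; i < 8; i++) {
        const unsigned shift = 8 * i;
        int8_t a = static_cast<int8_t>(static_cast<uint8_t>(x >> shift));
        int8_t b = static_cast<int8_t>(static_cast<uint8_t>(y >> shift));
        uint8_t r = static_cast<uint8_t>(satAddS8(a, b));
        result |= static_cast<uint64_t>(r) << shift;
    }
    return result;
}
```

Without saturation the same lanes would simply wrap around, so `0x7F + 0x01` would yield `0x80` rather than staying at `0x7F`.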
|
#
6568:a34aae12095c |
|
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement a media microop that converts between floating point data types.
|
#
6566:c246dc2ec640 |
|
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement a microop that compares fp values and writes a mask as its result.
|
#
6562:571fd8d89903 |
|
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement a media microop for converting integer values to floating point.
|
#
6560:323d48647000 |
|
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement a floating point media divide microop.
|
#
6558:8f37a2946cc3 |
|
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement a floating point media multiply microop.
|
#
6556:0e597fe2b391 |
|
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement a media floating point subtract microop.
|
#
6554:22cb3c1ea3fb |
|
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement a floating point media add microop.
|
#
6552:fa0ea492a075 |
|
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement a media square root microop.
|
#
6550:9754d16c242c |
|
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement the floating point media max microop.
|
#
6548:130e3dd23eab |
|
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement a floating point media min microop.
|
#
6546:c7e724c1570f |
|
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Create a pack media microop.
|
#
6545:9c68aea7b1e6 |
|
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Rename sel to ext for media microops.
|
#
6541:f70ee159db59 |
|
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement a multimedia andn microop.
|
#
6539:df1ebe278239 |
|
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement a multimedia and microop.
|
#
6537:bebbb828a363 |
|
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement a media or microop.
|
#
6534:0943f0e54f0f |
|
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement a media xor microop.
|
#
6521:ff5e7e6bcfbd |
|
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Implement an unpack microop.
|
#
6516:b5b420d15a20 |
|
17-Aug-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Set up a media microop framework and create mov2int and mov2fp microops.
|