Guest
On Monday, January 7, 2019 at 2:21:51 PM UTC-5, Weng Tianxiang wrote:
On Monday, January 7, 2019 at 5:11:05 AM UTC-8, KJ wrote:
On Saturday, January 5, 2019 at 8:23:43 PM UTC-5, Weng Tianxiang wrote:
In the above situation, each of the ~100,000 state machines, each having more than 10 states, must have a clock gating function to save power:
That is your unsubstantiated claim, not a fact.
when it will not change states on the next cycle, a clock pulse should not be generated to keep the state unchanged and save power consumption.
Any perceived lower power consumption has very, very little to do with the fact that the state does not change. A flip flop that is clocked but does not happen to change its output does not consume much power. The power is needed to charge/discharge the loads that are being driven. Any decreased power consumption would have to do with the decrease in power in generating the clock input to the flip flop. But shifting from a common clock to adding a gate that generates a clock probably does not lower power, since the same number of clock signals are being generated. If the gated clock routing is a higher-capacitance route than the one used by a free-running clock, then you can consume more power. This is the result when trying to implement gated clocks in an FPGA. An ASIC will be different.
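The argument above follows the usual first-order model for switching power on a CMOS node. The symbols below are the standard textbook ones and do not come from the thread:

```latex
P_{\mathrm{dyn}} = \alpha \, C \, V_{DD}^{2} \, f
```

Here $\alpha$ is the activity factor (the probability that the node makes a charging transition in a given cycle), $C$ is the switched capacitance, $V_{DD}$ is the supply voltage and $f$ is the clock frequency. Under this model a flip flop output that does not change has $\alpha \approx 0$ for the load it drives, whereas a free-running clock net has $\alpha = 1$. Gating therefore only helps to the extent that it reduces the $C$ that toggles on the clock network, and it can hurt if the gated route has a larger $C$ than the route it replaces.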
For an application implemented in an FPGA chip, the clock gating function may not be necessary because too few state machines are implemented in any normal application.
As I pointed out to you back in 2010 (I think), implementing what you describe in an FPGA results in an increase in power consumption. I provided you with all of the details for your sample design. The results of that analysis are not "because too few state machines are implemented"; it is because gated clocks in an FPGA use more power, not less. Again, that was with your sample design of that time, which appears to be the same thing you are reusing here.
Actually, after posting I realized how to implement the power-saving scheme in VHDL, as follows:
I noticed that you did not show the actual gating of the clock, only the apparent usage of a possibly free running clock.
a: process(clk)
begin
if rising_edge(clk) then
Also, the following 'elsif' is not necessary even though your comment says it is. No worries though, synthesis tools should optimize out the 'elsif' and leave the assignment 'WState <= WState_NS;' on every clock. If the tool somehow leaves it in, then there will be an increase in power consumption due to use of additional logic required to implement 'elsif WState /= WState_NS then'. That increase would need to be counted against any power savings that you think you're achieving. Again, it would probably be worthwhile for you to do some analysis prior to posting and claiming...but after all these years of not acting on this advice it doesn't appear that you're willing to make that behavioral change.
elsif WState /= WState_NS then -- WState /= WState_NS is necessary!
WState <= WState_NS;
end if;
end if;
end process;
I suspect that you did not actually test any of this prior to posting and claiming since the code is not complete and does not compile...as usual.
Kevin
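For reference, the fragment quoted above can be completed into something that compiles. This is only a sketch under stated assumptions: the entity name, the `reset` and `go` ports, the `busy` output and the three state names are invented for illustration, and only `clk`, `WState` and `WState_NS` appear in the thread. As Kevin notes, it still uses a free-running clock and contains no actual clock gate.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical wrapper around the quoted fragment. Everything except
-- clk, WState and WState_NS is an assumption made for illustration.
entity fsm_reg_sketch is
  port (
    clk   : in  std_logic;
    reset : in  std_logic;  -- assumed synchronous reset
    go    : in  std_logic;  -- assumed start input
    busy  : out std_logic
  );
end entity fsm_reg_sketch;

architecture rtl of fsm_reg_sketch is
  type state_t is (Idle_S, Run_S, Done_S);  -- assumed state names
  signal WState, WState_NS : state_t;
begin

  -- Next-state logic, purely combinational.
  next_state : process (WState, go)
  begin
    WState_NS <= WState;  -- default: hold the current state
    case WState is
      when Idle_S =>
        if go = '1' then
          WState_NS <= Run_S;
        end if;
      when Run_S =>
        WState_NS <= Done_S;
      when Done_S =>
        WState_NS <= Idle_S;
    end case;
  end process next_state;

  -- State register, clocked by the free-running clk.
  state_reg : process (clk)
  begin
    if rising_edge(clk) then
      if reset = '1' then
        WState <= Idle_S;
      elsif WState /= WState_NS then
        -- Functionally the same as an unconditional assignment: when
        -- WState = WState_NS, loading WState_NS changes nothing.
        WState <= WState_NS;
      end if;
    end if;
  end process state_reg;

  busy <= '1' when WState = Run_S else '0';

end architecture rtl;
```

Replacing the `elsif` branch with a plain `else WState <= WState_NS;` gives identical behavior, which is why a synthesis tool is free to drop the comparison.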
Hi,
There are several experts responding to my post. Thank you. Notably, I do not see Hans of www.ht-lab.com giving his opinion. Usually his opinion is reasonable and informative, and he knows many things outside FPGA chips that are beyond my knowledge.
Here is the background for the purpose of my post:
1. On 12/31/2018 I filed a non-provisional patent application. I asked for early publication. Publication will happen about 14 weeks after the filing date.
2. On 01/06/2019 I submitted almost the same version as a regular paper to IEEE Transactions on Circuits and Systems for publication. The review process may take up to 3 months.
Because of the IEEE Transactions' strict restrictions on a paper's originality, I cannot disclose any details about my invention until the journal either agrees to publish my paper in about 3 months or rejects it in 1 or 2 weeks.
Here are some facts of my invention:
1. The logic used to generate a state machine with clock gating devices is almost the same as what the conventional method would generate, or maybe even simpler.
I think you missed the mark by a wide margin on this one. The logic needed for the clock gating is this...
elsif WState /= WState_NS then
This is not so trivial compared to the FSM itself, especially in an ASIC. I would estimate it is approximately the same amount of logic in general.
> 2. I don't know how a CPU deals with the clocking scheme for the 100,000*4 FFs used in the state machines for Cache II control. If the designers don't care about the power saving, or have already implemented some such scheme, my invention would be of little value; otherwise it would be worth millions of dollars.
For a patent to be valid it has to be non-obvious to a practitioner in the field. I don't know how this is non-obvious to someone in the field of CPU design. You may obtain a patent, but then lose a patent defense case in court. But again, I didn't think cell phones would take off and now I have two.
> 3. My post's purpose is to test if such invention is of any value, not about how to implement a state machine with clock gating function.
What exactly is your "invention"??? Clock gating is nothing new. It is applied to many parts of a CPU. Is your invention the idea of applying it to the individual FSMs in a CPU cache? So if someone instead applies it to groupings of FSMs in a CPU cache they will have worked around your patent.
> 4. After my application is published in 3 months, I will immediately register and offer the application for sale at http://www.ast.com/interested-in-selling-to-ast/. I know the website because Google refers to it and indicates they are a member of the site. I expect that Intel, IBM, AMD, and Apple may also be members. The site asks for the selling price during registration, so it is important for me to assess my invention's value properly.
What value have you assessed so far?
5. I think no developers at Intel, IBM, AMD, or Apple would visit this website, not to mention take part in the discussion of my post.
6. I hope to discuss the invention in more detail in 3 months, before I register on the patent-selling website.
7. A Xilinx chip has a clock enable signal built into its cell block, one CE input for the 8 registers in the block. Altera may be in the same situation. So clock enable is nothing new, and we don't have to pay attention to how the clock trees work. For a CPU design, in my opinion, logic design and clock tree design are 2 separate domains, handled one after the other, and logic designers never have to pay attention to the clock trees.
Clock enable and clock gating are not the same thing. Clock enable saves power by not changing the FF state, but if the FF input is the same as the output the state won't change anyway.
Here is something to consider. Clock gating saves power compared to clock enabling by reducing the power consumed in the clock tree. How much of the clock tree will you actually be gating with a fine-grained approach? Clock trees are exponential structures with a multiplier for the fan-out at each level. With this fine-grained approach you are only saving power in the final level and, in fact, may be adding a level if your clock gating control is at a finer resolution than the last level of clock drive.
Generally clock gating is used at a high level to gate the clock to sections of a chip. I expect it is seldom if ever used at a low level because the power saved is not optimal and the logic required is maximal.
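The distinction drawn above between clock enable and clock gating can be sketched side by side in VHDL. This is a behavioral illustration only, and every name in it (`ce_vs_gate_sketch`, `en`, `d`, `q_ce`, `q_gate`, `en_latched`, `gclk`) is an assumption, not something from the thread. The gating half uses the common latch-plus-AND arrangement; real ASIC flows normally instantiate a library clock-gating cell or let the synthesis tool insert one rather than describing it in RTL like this.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical comparison of a clock-enabled register and a register
-- fed by a gated clock. All names are assumptions for illustration.
entity ce_vs_gate_sketch is
  port (
    clk    : in  std_logic;
    en     : in  std_logic;
    d      : in  std_logic;
    q_ce   : out std_logic;
    q_gate : out std_logic
  );
end entity ce_vs_gate_sketch;

architecture rtl of ce_vs_gate_sketch is
  signal en_latched : std_logic := '0';
  signal gclk       : std_logic;
begin

  -- Clock enable: the flip flop's clock pin toggles every cycle;
  -- the enable only decides whether the data is loaded.
  ce_ff : process (clk)
  begin
    if rising_edge(clk) then
      if en = '1' then
        q_ce <= d;
      end if;
    end if;
  end process ce_ff;

  -- Clock gating: a latch that is transparent while clk is low holds
  -- the enable stable during the high phase, so gclk cannot glitch.
  gate_latch : process (clk, en)
  begin
    if clk = '0' then
      en_latched <= en;
    end if;
  end process gate_latch;

  gclk <= clk and en_latched;

  -- The register only sees clock edges while the gate is open, so the
  -- clock net downstream of the gate stops toggling when en is low.
  gated_ff : process (gclk)
  begin
    if rising_edge(gclk) then
      q_gate <= d;
    end if;
  end process gated_ff;

end architecture rtl;
```

Both registers hold their value when `en` is low. The difference is where the toggling stops: with clock enable the whole clock net keeps switching, while with gating only the portion of the net below the gate goes quiet, which is why the level of the tree at which the gate sits matters so much for the power saved.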
Rick C.
-- Get 6 months of free supercharging
-- Tesla referral code - https://ts.la/richard11209