Distinct random time generation in the fixed interval

I'm trying to generate a random time between 8:00 AM and 8:00 PM for each row that is selected from a data set. However, I always get the same random value for every row, and I want it to be different for each row.

Table schema & data:

╔══════╦════════════════╗
║  ID  ║  CREATED_DATE  ║
╠══════╬════════════════╣
║ ID/1 ║   26/04/2014   ║
║ ID/2 ║   26/04/2014   ║
║ ID/3 ║   26/04/2014   ║
║ ID/4 ║   26/04/2014   ║
║ ID/5 ║   26/04/2014   ║
╚══════╩════════════════╝

Current SQL statement:

SELECT [ID]
     , MyFunction.dbo.AddWorkDays(14, [CREATED_DATE]) AS [New Date]
     , CONVERT(VARCHAR, DATEADD(MILLISECOND, CAST(43200000 * RAND() AS INT), CONVERT(TIME, '08:00')), 114) AS [New Time]
FROM [RandomTable]

Current results (same time for each row in the [New Time] column):

╔══════╦════════════════╦════════════════╗
║  ID  ║    New Date    ║    New Time    ║
╠══════╬════════════════╬════════════════╣
║ ID/1 ║   10/05/2014   ║    09:41:43    ║
║ ID/2 ║   10/05/2014   ║    09:41:43    ║
║ ID/3 ║   10/05/2014   ║    09:41:43    ║
║ ID/4 ║   10/05/2014   ║    09:41:43    ║
║ ID/5 ║   10/05/2014   ║    09:41:43    ║
╚══════╩════════════════╩════════════════╝

Desired results (different time for each row in the [New Time] column):

╔══════╦════════════════╦════════════════╗
║  ID  ║    New Date    ║    New Time    ║
╠══════╬════════════════╬════════════════╣
║ ID/1 ║   10/05/2014   ║    09:41:43    ║
║ ID/2 ║   10/05/2014   ║    15:05:23    ║
║ ID/3 ║   10/05/2014   ║    10:01:05    ║
║ ID/4 ║   10/05/2014   ║    19:32:45    ║
║ ID/5 ║   10/05/2014   ║    08:43:15    ║
╚══════╩════════════════╩════════════════╝

Any ideas on how to fix this? All of the above is just sample data – my real table has around 2800 records (not sure if that will make a difference to anyone's suggestions).

Comments

Interpretation of Original Question:

The question states:

  • Generate a random time between 8:00 AM and 8:00 PM (i.e. a 12-hour window)
  • It should be different for each row (i.e. unique across all rows)
  • The real table has around 2800 records

Now factor in the following points:

  • Sample data shows only a single date
  • There are 86,400 seconds in 24 hours, hence 43,200 seconds in 12 hours

There is some ambiguity in the following areas:

  • What exactly is random within the context of "different for every row", given that truly random values cannot be guaranteed to be different for every row. In fact, truly random numbers could theoretically be the same for every row. So is the emphasis on "random" or "different"? Or are we really talking about different but not sequentially ordered (to give the appearance of randomness without actually being random)?
  • What if there are ever more than 2800 rows? What if there are 1 million rows?
  • If there can be more than 43,200 rows, how should "different for each row" be handled (since it is not possible for the times to be unique across all rows)?
  • Will the date ever vary? If so, are we really talking about "different for each row per date"?
  • If "different for each row per date":
    • Can the times for each date follow the same, non-sequential pattern? Or does the pattern need to differ per each date?
    • Will there ever be more than 43,200 rows for any particular date? If so, the times can only be unique per each set of 43,200 rows.

Given the information above, there are a few ways to interpret the request:

  1. Emphasis on "random": Dates and number of rows don't matter. Generate truly random times that are highly likely, but not guaranteed, to be unique using one of the three methods shown in the other answers:
    • @notulysses: RAND(CAST(NEWID() AS VARBINARY)) * 43200
    • @Steve Ford: ABS(CHECKSUM(NewId()) % 43201)
    • @Vladimir Baranov : CAST(43200000 * (CAST(CRYPT_GEN_RANDOM(4) as int) / 4294967295.0 + 0.5) as int)
  2. Emphasis on "different for each row", always <= 43,200 rows: If the number of rows never exceeds the number of available seconds, it is easy to guarantee unique times across all rows, regardless of whether the dates are the same or different, while still appearing to be randomly ordered.
  3. Emphasis on "different for each row", could be > 43,200 rows: If the number of rows can exceed the number of available seconds, then it is not possible to guarantee uniqueness across all rows, but it would be possible to still guarantee uniqueness across rows of any particular date, provided that no particular date has > 43,200 rows.
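
To put a rough number on "highly likely, but not guaranteed" in option 1, here is a back-of-the-envelope birthday-problem estimate. This is my own illustration, not something taken from the other answers. It assumes n = 2800 rows drawn uniformly and independently from N possible values, and uses the standard bound P(all distinct) <= exp(-n(n-1) / (2N)):

```sql
-- Birthday-problem upper bound on the chance that 2800 random draws are all distinct.
-- N is the number of possible values: 43,200 seconds or 43,200,000 milliseconds in 12 hours.
SELECT v.Granularity,
       v.N AS PossibleValues,
       EXP(-(2800.0 * 2799.0) / (2.0 * v.N)) AS UpperBoundAllDistinct
FROM (VALUES ('second',      43200.0),
             ('millisecond', 43200000.0)) AS v(Granularity, N);
```

At one-second granularity the bound is vanishingly small (on the order of 10^-40), so duplicates are practically certain with 2800 rows. At millisecond granularity it is roughly 0.91, so a fully unique result is likely but far from assured.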

Hence, I based my answer on the idea that:

  • Even if the number of rows for the O.P. never exceeds 2800, it is more likely that most others who are encountering a similar need for randomness would have a larger data set to work with (i.e. there could easily be 1 million rows, for any number of dates: 1, 5000, etc.)
  • Either the sample data is overly simplistic in using the same date for all 5 rows, or even if the date is the same for all rows in this particular case, in most other cases that is less likely to happen
  • Uniqueness is to be favored over Randomness
  • If there is a pattern to the "seemingly random" ordering of the seconds for each date, there should at least be a varying offset to the start of the sequence across the dates (when the dates are ordered sequentially) to give the appearance of randomness between any small grouping of dates.

Answer:

If the situation requires unique times, that cannot be guaranteed with any method of generating truly random values. I really like the use of CRYPT_GEN_RANDOM by @Vladimir Baranov, but with only 4 random bytes it is nearly impossible to get a fully unique set of values across a large row set, so the following insert will almost always fail on the UNIQUE constraint:

DECLARE @Table TABLE (Col1 BIGINT NOT NULL UNIQUE);

INSERT INTO @Table (Col1)
    SELECT CONVERT(BIGINT, CRYPT_GEN_RANDOM(4))
    FROM [master].sys.objects so
    CROSS JOIN [master].sys.objects so2
    CROSS JOIN [master].sys.objects so3;
    -- 753,571 rows

Increasing the random value to 8 bytes does seem to work:

DECLARE @Table TABLE (Col1 BIGINT NOT NULL UNIQUE);

INSERT INTO @Table (Col1)
    SELECT CONVERT(BIGINT, CRYPT_GEN_RANDOM(8))
    FROM [master].sys.objects so
    CROSS JOIN [master].sys.objects so2
    CROSS JOIN [master].sys.objects so3;
    -- 753,571 rows

Of course, if we are generating down to the second, then there are only 86,400 of those. Reducing the scope seems to help as the following does occasionally work:

DECLARE @Table TABLE (Col1 BIGINT NOT NULL UNIQUE);

INSERT INTO @Table (Col1)
    SELECT TOP (86400) CONVERT(BIGINT, CRYPT_GEN_RANDOM(4))
    FROM [master].sys.objects so
    CROSS JOIN [master].sys.objects so2
    CROSS JOIN [master].sys.objects so3;

However, things get a bit trickier if uniqueness is needed per day (which seems like a reasonable requirement for this type of project, as opposed to uniqueness across all days), because a random number generator isn't going to know to reset at each new day.

If it is acceptable to merely have the appearance of being random, then we can guarantee uniqueness per each date without:

  • looping / cursor constructs
  • saving already used values in a table
  • using RAND(), NEWID(), or CRYPT_GEN_RANDOM()

The following solution uses the concept of Modular Multiplicative Inverses (MMI), which I learned about in this answer: generate seemingly random unique numeric ID in SQL Server. Of course, that question did not have a tightly defined range of values like we have here, with only 86,400 of them per day. So, I used a range of 86400 (as "Modulo") and tried a few "coprime" values (as "Integer") in an online calculator to get their MMIs:

  • 13 (MMI = 39877)
  • 37 (MMI = 51373)
  • 59 (MMI = 39539)
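
These pairs can be sanity-checked directly in T-SQL. The sketch below is an illustration added here, not part of the original answer. Each integer multiplied by its MMI should leave a remainder of 1 modulo 86400, and multiplying every second 0..86399 by the MMI should produce 86,400 distinct results. The row source (sys.all_columns cross-joined with itself) is only an assumed convenient way to get enough rows:

```sql
-- Each candidate times its MMI should leave a remainder of 1 modulo 86400
SELECT v.Candidate,
       v.Mmi,
       (v.Candidate * v.Mmi) % 86400 AS [ShouldBeOne]
FROM (VALUES (13, 39877), (37, 51373), (59, 39539)) AS v(Candidate, Mmi);

-- Multiplying by a value coprime to 86400 permutes 0..86399, so no duplicates should appear
;WITH Seconds AS
(
  SELECT TOP (86400)
         ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1 AS [Sec] -- BIGINT, so the product below cannot overflow INT
  FROM   sys.all_columns a
  CROSS JOIN sys.all_columns b
)
SELECT COUNT(*) AS [Inputs],
       COUNT(DISTINCT (s.[Sec] * 39539) % 86400) AS [DistinctOutputs]
FROM   Seconds s;
```

If both counts in the second result come back as 86400, every second of the day maps to a different second, which is what makes the per-date uniqueness guarantee possible.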

I use ROW_NUMBER() in a CTE, partitioned (i.e. grouped) by CREATED_DATE as a means of assigning each second of the day a value.

But, while the values generated for seconds 0, 1, 2, ... and so on sequentially will appear random, across different days that particular second will map to the same value. So, the second CTE (named "WhichSecond") shifts the starting point for each date by converting the date to an INT (which yields a sequential day offset from 1900-01-01), subtracting a fixed base of 29219 (the day number of 1980-01-01), and then multiplying by 101.

DECLARE @Data TABLE
(
  ID INT NOT NULL IDENTITY(1, 1),
  CREATED_DATE DATE NOT NULL
);

INSERT INTO @Data (CREATED_DATE) VALUES ('2014-10-05');
INSERT INTO @Data (CREATED_DATE) VALUES ('2014-10-05');
INSERT INTO @Data (CREATED_DATE) VALUES ('2014-10-05');
INSERT INTO @Data (CREATED_DATE) VALUES ('2014-10-05');
INSERT INTO @Data (CREATED_DATE) VALUES ('2014-10-05');
INSERT INTO @Data (CREATED_DATE) VALUES ('2015-03-15');
INSERT INTO @Data (CREATED_DATE) VALUES ('2016-10-22');
INSERT INTO @Data (CREATED_DATE) VALUES ('2015-03-15');

;WITH cte AS
(
  SELECT tmp.ID,
         CONVERT(DATETIME, tmp.CREATED_DATE) AS [CREATED_DATE],
         ROW_NUMBER() OVER (PARTITION BY tmp.CREATED_DATE ORDER BY (SELECT NULL))
                      AS [RowNum]
  FROM   @Data tmp
), WhichSecond AS
(
  SELECT cte.ID,
         cte.CREATED_DATE,
         ((CONVERT(INT, cte.[CREATED_DATE]) - 29219) * 101) + cte.[RowNum]
                      AS [ThisSecond]
  FROM   cte
)
SELECT parts.*,
       (parts.ThisSecond % 86400) AS [NormalizedSecond], -- wrap around to 0 when
                                                         -- value goes above 86,400
       ((parts.ThisSecond % 86400) * 39539) % 86400 AS [ActualSecond],
       DATEADD(
                 SECOND,
                 (((parts.ThisSecond % 86400) * 39539) % 86400),
                 parts.CREATED_DATE
              ) AS [DateWithUniqueTime]
FROM WhichSecond parts
ORDER BY parts.ID;

Returns:

ID  CREATED_DATE  ThisSecond  NormalizedSecond  ActualSecond  DateWithUniqueTime
1   2014-10-05    1282297     72697             11483         2014-10-05 03:11:23.000
2   2014-10-05    1282298     72698             51022         2014-10-05 14:10:22.000
3   2014-10-05    1282299     72699              4161         2014-10-05 01:09:21.000
4   2014-10-05    1282300     72700             43700         2014-10-05 12:08:20.000
5   2014-10-05    1282301     72701             83239         2014-10-05 23:07:19.000
6   2015-03-15    1298558      2558             52762         2015-03-15 14:39:22.000
7   2016-10-22    1357845     61845             83055         2016-10-22 23:04:15.000
8   2015-03-15    1298559      2559              5901         2015-03-15 01:38:21.000

If we want to only generate times between 8:00 AM and 8:00 PM, we only need to make a few minor adjustments:

  1. Change the range (as "Modulo") from 86400 to half of it: 43200
  2. Recalculate the MMI (can use the same "coprime" values as "Integer"): 39539 (same as before)
  3. Add 28800 to the second parameter of the DATEADD as an 8 hour offset

The result will be a change to just one line (since the others are diagnostic):

-- second parameter of the DATEADD() call
28800 + (((parts.ThisSecond % 43200) * 39539) % 43200)
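
As a hedged check of the adjusted expression (my own sketch, reusing the same assumed row source as a number generator), the query below feeds every offset 0..43199 through the formula and confirms that the outputs stay distinct and fall inside the 8:00 AM to 8:00 PM window:

```sql
;WITH Seconds AS
(
  SELECT TOP (43200)
         ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1 AS [Sec] -- 0 .. 43199, BIGINT
  FROM   sys.all_columns a
  CROSS JOIN sys.all_columns b
), Mapped AS
(
  -- same shape as the second parameter of the DATEADD() call above
  SELECT CONVERT(INT, 28800 + ((s.[Sec] * 39539) % 43200)) AS [SecondOfDay]
  FROM   Seconds s
)
SELECT COUNT(DISTINCT m.[SecondOfDay]) AS [DistinctSeconds],
       MIN(DATEADD(SECOND, m.[SecondOfDay], CONVERT(TIME(0), '00:00'))) AS [Earliest],
       MAX(DATEADD(SECOND, m.[SecondOfDay], CONVERT(TIME(0), '00:00'))) AS [Latest]
FROM   Mapped m;
```

The distinct count should equal 43200, with the earliest time at 08:00:00 and the latest just before 20:00:00, because the mapped value always lies between 28800 and 71999.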

Another means of shifting each day in a less predictable fashion would be to make use of RAND() by passing in the INT form of CREATED_DATE in the "WhichSecond" CTE. This would give a stable offset per each date since RAND(x) will return the same value y for the same value of x passed in, but will return a different value y for a different value of x passed in. Meaning:

RAND(1) = y1
RAND(2) = y2
RAND(3) = y3
RAND(2) = y2

The second time RAND(2) was called, it still returned the same value of y2 that it returned the first time it was called.
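
A one-line way to see this for yourself (an illustrative sketch; the column aliases are mine):

```sql
-- The same seed always yields the same value; different seeds yield different values
SELECT RAND(1) AS [y1],
       RAND(2) AS [y2],
       RAND(3) AS [y3],
       RAND(2) AS [y2_again]; -- matches y2
```

It is worth looking at how far apart the values for neighbouring seeds actually are on your server before choosing the multiplier, since the size of the per-date shift depends on that spread.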

Hence, the "WhichSecond" CTE could be:

(
  SELECT cte.ID,
         cte.CREATED_DATE,
         (RAND(CONVERT(INT, cte.[CREATED_DATE])) * {some number}) + cte.[RowNum]
                      AS [ThisSecond]
  FROM   cte
)

The issue the OP had when using just RAND() is due to its being evaluated once per query.

From the documentation:

If seed is not specified, the SQL Server Database Engine assigns a seed value at random. For a specified seed value, the result returned is always the same.
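
The difference is easy to reproduce with a small stand-alone query. This is a minimal sketch; the inline VALUES list and the column aliases are only there for illustration:

```sql
SELECT v.n,
       RAND()                  AS [SameForEveryRow], -- unseeded: evaluated once for the whole statement
       RAND(CHECKSUM(NEWID())) AS [VariesPerRow]     -- NEWID() is evaluated per row, so the seed changes per row
FROM (VALUES (1), (2), (3), (4), (5)) AS v(n);
```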

The approach described below defeats that optimization by supplying a new seed for each row, so RAND() is evaluated once per row:

dateadd( second
       , rand(cast(newid() as varbinary)) * 43200
       , cast('08:00:00' as time) )
  • newid() generates a unique value of type uniqueidentifier;
  • that value is cast to varbinary and used as the seed in rand([seed]), which returns a pseudo-random float value between 0 and 1 (exclusive). Because the seed differs from row to row, the returned value differs too in practice, although it is not strictly guaranteed to be unique, since the seed is reduced to an int.

SQLFiddle

You could alternatively use:

SELECT DATEADD(s, ABS(CHECKSUM(NewId()) % 43201), CAST('08:00:00' AS Time))

The ABS(CHECKSUM(NewId()) % 43201) generates a random number between 0 and 43200. See Discussion here.

SQL Fiddle

MS SQL Server 2008 Schema Setup:

Query 1:

SELECT DATEADD(s, ABS(CHECKSUM(NewId()) % 43201), CAST('08:00:00' AS Time)) AS [RandomTime]
FROM 
( VALUES (1), (2), (3), (4), (5)
) Y(A)
CROSS JOIN
( VALUES (1), (2), (3), (4), (5)
) Z(A)

Results:

|    RANDOMTIME    |
|------------------|
| 16:51:58.0000000 |
| 10:42:44.0000000 |
| 14:01:38.0000000 |
| 13:33:51.0000000 |
| 18:00:51.0000000 |
| 11:29:03.0000000 |
| 10:21:14.0000000 |
| 16:38:27.0000000 |
| 09:55:37.0000000 |
| 13:21:13.0000000 |
| 11:29:37.0000000 |
| 10:57:49.0000000 |
| 14:56:42.0000000 |
| 15:33:11.0000000 |
| 18:49:45.0000000 |
| 16:23:28.0000000 |
| 09:00:05.0000000 |
| 09:20:01.0000000 |
| 11:26:23.0000000 |
| 15:26:23.0000000 |
| 10:38:44.0000000 |
| 11:46:30.0000000 |
| 16:00:59.0000000 |
| 09:29:18.0000000 |
| 09:09:19.0000000 |

There are several methods:

  • Generate a table with random numbers in advance and use it whenever needed. Or take this data from some reputable source.
  • Various combinations that use NEWID function to provide a seed for RAND. It should be used with caution, because there is no guarantee about distribution of NEWID values. One of the best methods to make it more or less uniformly distributed is via the CHECKSUM: RAND(CHECKSUM(NEWID())). The good thing about this method is that NEWID function is available since SQL Server 2000.
  • Instead of NEWID use, say, MD5 of some column as a seed for RAND: RAND(CHECKSUM(HASHBYTES('MD5', CAST(SomeID AS varbinary(4))))) Or simply row number: RAND(CHECKSUM(HASHBYTES('MD5', CAST(ROW_NUMBER() OVER(ORDER BY ...) AS varbinary(4))))). This method is available since at least SQL Server 2005. The primary difference from NEWID method is that you have full control over the random sequence. You can't control what NEWID returns and you can't restart the random sequence from the same number again. If you supply same sets of, say, row numbers using PARTITION BY you'll get same sets of random numbers. It may be useful in the cases when you need to use the same sequence of random numbers several times. It is possible to get the same random number for two different seeds. I tested it for row numbers from 1 to 1,000,000. MD5 of them are all different. CHECKSUM of MD5 result in 122 collisions. RAND of this CHECKSUM result in 246 collisions. When tested with row numbers from 1 to 100,000 CHECKSUM had 1 collision, RAND had 3 collisions.
  • Another possibility is to simply implement your own user-defined function in T-SQL that generates a random number using your preferred algorithm. In this case you have full control of everything. Usually pseudo random generators have to store their internal state between invocations, so you may end up with having a dedicated table that stores this data.
  • You can write your user-defined function using CLR. In this case you can implement your own generator, or use functions built-into .NET, like Random class, or RNGCryptoServiceProvider class.
  • Finally, since SQL Server 2008 there is a built-in function CRYPT_GEN_RANDOM.
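
To illustrate the repeatable-seed idea from the list above, here is a minimal sketch that seeds RAND with a hash of an ID column. The inline VALUES list stands in for a real table, and the column names are assumptions made for the example:

```sql
-- Running this twice returns the same values, because each seed depends only on the ID
SELECT v.ID,
       RAND(CHECKSUM(HASHBYTES('MD5', CAST(v.ID AS varbinary(4))))) AS [RepeatableRandom],
       DATEADD(SECOND,
               CAST(43200 * RAND(CHECKSUM(HASHBYTES('MD5', CAST(v.ID AS varbinary(4))))) AS int),
               CAST('08:00' AS time(0))) AS [RepeatableTime]
FROM (VALUES (1), (2), (3), (4), (5)) AS v(ID);
```

Swapping the hash input for NEWID() would give a fresh sequence on every run instead of a repeatable one.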

I will describe the last method in detail, because I think that it is a very good solution for SQL Server 2008 and above. CRYPT_GEN_RANDOM is called for each row of the result set, as opposed to RAND, which is called only once.

CRYPT_GEN_RANDOM (Transact-SQL)

Returns a cryptographic random number generated by the Crypto API (CAPI). The output is a hexadecimal number of the specified number of bytes.

Besides, CRYPT_GEN_RANDOM should provide much better random values than RAND, both in terms of distribution and crypto-strength. Example:

(CAST(CRYPT_GEN_RANDOM(4) as int) / 4294967295.0 + 0.5)

This generates 4 random bytes as varbinary. We have to explicitly cast them to int first. Then the result is scaled to a fractional number between 0 and 1.
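
The intermediate steps can be inspected one at a time. The sketch below captures a single random value in a variable (so that every column refers to the same bytes) and shows each stage of the conversion. The variable and column names are mine:

```sql
DECLARE @raw varbinary(4) = CRYPT_GEN_RANDOM(4); -- 4 random bytes

SELECT @raw                                      AS [RawBytes],
       CAST(@raw AS int)                         AS [AsInt],     -- signed 32-bit integer
       CAST(@raw AS int) / 4294967295.0          AS [Centered],  -- roughly -0.5 .. 0.5
       CAST(@raw AS int) / 4294967295.0 + 0.5    AS [Fraction],  -- roughly 0 .. 1
       CAST(43200000 * (CAST(@raw AS int) / 4294967295.0 + 0.5) AS int)
                                                 AS [MillisecondsAfter8am];
```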

So, the original query would look like this:

SELECT S.ID AS [ID]
     , MyFunction.dbo.AddWorkDays(14, S.CREATED_DATE) AS [New Date]
     , CONVERT(VARCHAR, DATEADD(MILLISECOND,
           CAST(43200000 * (CAST(CRYPT_GEN_RANDOM(4) AS int) / 4294967295.0 + 0.5) AS int),
           CONVERT(TIME, '08:00')), 114) AS [New Time]
FROM RandomTable AS S -- the alias S is required by the S.CREATED_DATE reference above

Here is a stand-alone example that is easy to copy-paste and try (I used the query from another answer by @Steve Ford):

SELECT DATEADD(millisecond, 
    CAST(43200000 * (CAST(CRYPT_GEN_RANDOM(4) as int) / 4294967295.0 + 0.5) as int), 
    CAST('08:00:00' AS Time)) AS [RandomTime]
FROM 
    ( VALUES (1), (2), (3), (4), (5)
    ) Y(A)
    CROSS JOIN
    ( VALUES (1), (2), (3), (4), (5)
    ) Z(A)

This is the result:

RandomTime
10:58:24.7200000
19:40:06.7220000
11:04:29.0530000
08:57:31.6130000
15:03:14.9470000
09:15:34.9380000
13:46:43.1250000
11:27:00.8940000
14:42:23.6100000
15:07:56.2120000
11:39:09.8830000
08:16:44.3960000
14:23:38.4820000
17:28:31.7440000
16:29:31.4320000
09:09:15.0210000
12:31:09.8370000
11:23:09.8430000
15:35:45.5480000
17:42:49.3390000
08:07:05.4930000
18:17:16.2980000
11:49:08.2010000
10:20:21.7620000
15:56:58.6110000

Addition

When I read the original question I didn't think that it was really necessary to ensure that all generated random numbers are unique. I interpreted the word "different" in the question as a vague opposite to seeing the same number in each row of the result, which is what you see when using a simple SELECT RAND(). I think that in many cases it doesn't matter if there are a few colliding random numbers. In many cases it would actually be the correct behavior.

So, my understanding is that when there is a need of a sequence of unique random numbers, it is in a sense equivalent to the following task. We have a set of some values/rows, for example, a set of unique IDs or all 86400 seconds of a day or 2800 rows for a given day. We want to shuffle these values/rows. We want to rearrange these rows in a random order.

To shuffle the given set of rows we simply need to ORDER BY random numbers (these random numbers may have reasonable amount of collisions here). Random numbers could be generated by any method. Something like this:

ROW_NUMBER() OVER ([optional PARTITION BY ...] ORDER BY CRYPT_GEN_RANDOM(4)) 

or literally

SELECT ...
FROM ...
ORDER BY CRYPT_GEN_RANDOM(4)

depending on where and how it is used.
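
As a concrete sketch of the shuffle idea, under my own assumptions: the table variables, their column names, and the use of sys.all_columns as a row source are all illustrative, and the approach only works while no date has more than 43,200 rows. The seconds of the 12-hour window are shuffled once and stored, and each row then takes the second at its own position:

```sql
-- Sample rows standing in for the real table
DECLARE @Data TABLE (ID int NOT NULL PRIMARY KEY, CREATED_DATE date NOT NULL);
INSERT INTO @Data (ID, CREATED_DATE)
VALUES (1, '2014-04-26'), (2, '2014-04-26'), (3, '2014-04-26'),
       (4, '2014-04-26'), (5, '2014-04-26');

-- Shuffle the offsets 0..43199 once and store them, so the random order is computed a single time
DECLARE @Shuffled TABLE (Pos bigint NOT NULL PRIMARY KEY, Sec int NOT NULL);
INSERT INTO @Shuffled (Pos, Sec)
SELECT ROW_NUMBER() OVER (ORDER BY CRYPT_GEN_RANDOM(4)), s.Sec
FROM (
       SELECT TOP (43200)
              ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1 AS Sec
       FROM   sys.all_columns a
       CROSS JOIN sys.all_columns b
     ) AS s;

-- Number the rows within each date and pick the shuffled second at that position
;WITH Numbered AS
(
  SELECT d.ID,
         d.CREATED_DATE,
         ROW_NUMBER() OVER (PARTITION BY d.CREATED_DATE ORDER BY d.ID) AS Pos
  FROM   @Data d
)
SELECT n.ID,
       n.CREATED_DATE,
       DATEADD(SECOND, sh.Sec, CONVERT(time(0), '08:00')) AS [New Time]
FROM   Numbered n
JOIN   @Shuffled sh ON sh.Pos = n.Pos
ORDER BY n.ID;
```

Every date draws from the same shuffled list here, so times are unique within a date but the pattern repeats across dates. Storing the shuffle in a table is a deliberate choice to avoid the random ordering being re-evaluated during the join.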

Test this:

 Declare @t table(ID int,CREATED_DATE datetime)
insert into @t values
 (1 ,  '04/26/2014'),
 (2 ,  '04/26/2014'),
 (3 ,  '04/26/2014'),
 (4 ,  '04/26/2014')

 ;WITH CTE AS
 (
   SELECT *,CONVERT(VARCHAR, DATEADD(SECOND, RAND(CAST(NEWID() AS VARBINARY)) * 43200, 
   CAST('08:00:00' AS TIME)),114) AS [New Time] FROM @t WHERE ID=1
   UNION ALL
   SELECT *,CONVERT(VARCHAR, DATEADD(SECOND, RAND(CAST(NEWID() AS VARBINARY)) * 43200, 
   CAST('08:00:00' AS TIME)), 114)  FROM @t WHERE ID>1 AND ID<=5
 )
 SELECT * FROM CTE

Here's another option that gives you a bit more control over how the time is generated. You can specify the interval between the random times. It also doesn't make use of the RAND function.

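DECLARE @StartTime  VARCHAR(10) = '08:00',
        @EndTime    VARCHAR(10) = '20:00',
        @Interval   INT = 5; -- in seconds; the DECLARE must be terminated before WITH

-- Build the list of candidate times, one every @Interval seconds
WITH times AS (
    SELECT CONVERT(TIME, @StartTime) AS t
    UNION ALL
    SELECT DATEADD(SECOND, @Interval, t)
    FROM times
    WHERE t < @EndTime
)
SELECT *,
       -- the reference to d.Id correlates the subquery, so it is evaluated once per row
       (SELECT TOP 1 t FROM times WHERE d.Id > 0 ORDER BY NEWID()) AS [RandomTime]
FROM #data d -- #data is the table holding the rows; it is not defined in this snippet
OPTION (MAXRECURSION 0);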

On a side note:
If you remove the WHERE clause in the subquery above (WHERE d.Id > 0), the same time value is returned for all the rows, i.e. the same problem that you started with.

All,

I thought I'd share the answer to my question. I can't remember exactly where I found the details - I think it was via one of the links provided by sgeddes.

I used the following query to get a random time between 8am and 7:55pm (roughly)

SELECT convert(varchar,CONVERT(varchar, DATEADD(ms, dbo.MyRand(335 ,830) * 86400, 0), 114),114)

The MyRand function is below:

SET ANSI_NULLS ON;
GO
SET QUOTED_IDENTIFIER ON;
GO
CREATE FUNCTION dbo.myRand(@Min INT, @Max INT) RETURNS decimal(18,15) AS
BEGIN
DECLARE @BinaryFloat BINARY(8)
SELECT @BinaryFloat = CAST(Id AS BINARY) FROM vwGuid

DECLARE
@PartValue TINYINT,
@Mask TINYINT,
@Mantissa FLOAT,
@Exponent SMALLINT,
@Bit TINYINT,
@Ln2 FLOAT,
@BigValue BIGINT,
@RandomNumber FLOAT

SELECT
@Mantissa = 1,
@Bit = 1,
@Ln2 = LOG(2),
@BigValue = CAST(@BinaryFloat AS BIGINT),
@Exponent = (@BigValue & 0x7ff0000000000000) / EXP(52 * @Ln2)

WHILE @Part <= 8
BEGIN
SELECT
@PartValue = CAST(SUBSTRING(@BinaryFloat, @Part, 1) AS TINYINT),
@Mask =

WHILE @Mask > 0
BEGIN
IF @PartValue & @Mask > 0
SET @Mantissa = @Mantissa + EXP(-@Bit * @Ln2)

SELECT
@Mask = @Mask / 2
END
END

SET @RandomNumber = CASE @Exponent WHEN 0 THEN 0 ELSE CAST(@Exponent AS FLOAT) / 2047 END

RETURN CAST((@RandomNumber * (@Max - @Min)) + @Min AS DECIMAL(18,15))

END
GO
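
The function reads from a view named vwGuid that is not included above. A common pattern, and purely my assumption about what it looked like, is a one-row view wrapping NEWID(), since NEWID() cannot be called directly inside a user-defined function:

```sql
-- Hypothetical definition of the view that dbo.myRand selects from (not shown in the original post)
CREATE VIEW dbo.vwGuid
AS
SELECT NEWID() AS Id;
GO
```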

I hope this helps. I haven't read many of the replies above so apologies if someone has a better answer - this is simply how I solved it.

Thanks
